http://qs321.pair.com?node_id=488871

OnionKnight has asked for the wisdom of the Perl Monks concerning the following question:

I want to write a subroutine that parses BB tags, but I'm not sure how to do it. Replacing them with HTML isn't hard, but what if someone opens a tag and doesn't close it (e.g. a [b] with no matching [/b])? I could count the opened and closed tags, check whether the counts match, and print an error message or correct the markup if they don't.
The problem is that a regex match returns true or false in scalar context, which I don't want, and the list of captures in list context. I could capture all the tags, assign them to a list and check that, but don't you guys usually say that capturing is really slow?
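For the counting idea, a minimal sketch: a /g match evaluated in list context and then assigned to a scalar gives the number of matches, with no capture groups needed. The helper name count_tags is made up for illustration, and the sketch assumes tags are written as plain [name] and [/name] with no attributes.

```perl
use strict;
use warnings;

# count_tags() is a hypothetical helper, not part of any module.
# It returns how many opening and closing tags of one name appear in $text.
sub count_tags {
    my ($text, $tag) = @_;
    # "() = ..." forces list context on the /g match; assigning that
    # list assignment to a scalar yields the number of matches.
    my $open  = () = $text =~ /\[\Q$tag\E\]/gi;
    my $close = () = $text =~ /\[\/\Q$tag\E\]/gi;
    return ($open, $close);
}

my ($open, $close) = count_tags('[b]bold [b]again[/b]', 'b');
print "open=$open close=$close\n";   # prints "open=2 close=1"
```

Matching counts only show that nothing is missing; they do not show that the tags are nested correctly, so this is a quick sanity check rather than a validator.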

Also, I am using >>\d+ to refer to other posters; for example, >>1 refers to the first poster. But I don't want this substitution (wrapping it in an <a> tag) to happen inside a [code] tag, so how do I do this? I was thinking of some lookbehind/lookahead thing like /(?<!\[code\])>>\d+(?!\[\/code\])/ but will that work if a user were to write, for example:
'[code]print "funny text";[/code] >>4 blargh [code]5>>1 is 2[/code]'

The desired output would be:
'<span class="code">print &quot;funny text&quot;;</span> <a href="#4">&gt;&gt;4</a> blargh <span class="code">5&gt;&gt;1 is 2</span>'

But I suspect that the ">>4" won't get substituted with an <a> tag. (Entities are already being taken care of by escapeHTML, so don't worry about that.)
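One possible alternative to lookarounds, sketched under assumptions: split the text on whole [code]...[/code] blocks and apply the >>N substitution only to the pieces in between. The name link_refs is hypothetical. The sketch assumes the text has already been through escapeHTML (so ">>" arrives as "&gt;&gt;") and that [code] blocks do not nest.

```perl
use strict;
use warnings;

# link_refs() is a hypothetical helper. It links &gt;&gt;N references
# everywhere except inside [code]...[/code] blocks.
sub link_refs {
    my ($text) = @_;
    # The capturing parentheses make split return the [code] blocks too,
    # so they end up at the odd indexes of @parts.
    my @parts = split /(\[code\].*?\[\/code\])/is, $text;
    for my $i (0 .. $#parts) {
        if ($i % 2) {
            # A [code] block: wrap it in a span and leave its content alone.
            $parts[$i] =~ s{^\[code\](.*)\[/code\]$}{<span class="code">$1</span>}is;
        }
        else {
            # Ordinary text: turn every &gt;&gt;N into a link to post N.
            $parts[$i] =~ s{&gt;&gt;(\d+)}{<a href="#$1">&gt;&gt;$1</a>}g;
        }
    }
    return join '', @parts;
}

print link_refs('[code]5&gt;&gt;1 is 2[/code] &gt;&gt;4 blargh'), "\n";
```

Because the decision is made per piece rather than per match, it does not matter how far a >>N sits from the nearest [code] or [/code].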

Also, is it possible to get all this done as valid XHTML (i.e. tags close strictly in the reverse order they were opened) without too much work?
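A stack is one common way to get strictly nested output, and it handles unclosed tags at the same time. The sketch below is illustrative only: bb_to_xhtml and the %html_for table are invented names, only attribute-free tags like [b] and [/b] are handled, and the policy for bad input (drop stray closers, auto-close anything left open) is an assumption, not the only reasonable choice.

```perl
use strict;
use warnings;

# Hypothetical table mapping BB tag names to XHTML element names.
my %html_for = (b => 'b', i => 'i', u => 'u');

# bb_to_xhtml() is a hypothetical helper that emits properly nested tags.
sub bb_to_xhtml {
    my ($text) = @_;
    my @stack;        # names of the tags currently open, outermost first
    my $out = '';
    # split with a capture keeps the tags in the list, at odd indexes.
    my @tokens = split /(\[\/?\w+\])/, $text;
    for my $i (0 .. $#tokens) {
        my $tok = $tokens[$i];
        if ($i % 2 && $tok =~ /^\[(\/?)(\w+)\]$/ && exists $html_for{lc $2}) {
            my ($closing, $name) = ($1, lc $2);
            if (!$closing) {
                push @stack, $name;
                $out .= "<$html_for{$name}>";
            }
            elsif (grep { $_ eq $name } @stack) {
                # Close everything opened after the matching tag, innermost first.
                while (defined(my $top = pop @stack)) {
                    $out .= "</$html_for{$top}>";
                    last if $top eq $name;
                }
            }
            # A closing tag with no matching opener is dropped.
        }
        else {
            $out .= $tok;   # plain text, or a tag this table does not know
        }
    }
    # Close whatever the user left open, innermost first.
    $out .= "</$html_for{$_}>" for reverse @stack;
    return $out;
}

print bb_to_xhtml('[b]bold [i]both[/b] tail'), "\n";
# prints "<b>bold <i>both</i></b> tail"
```

Note that a misnested inner tag is simply closed early and not reopened, so the rendered result can differ slightly from what the poster intended.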

2005-09-03 Retitled by Arunbear, as per Monastery guidelines
Original title: 'BBCode'