
Re^3: POD style regex for inline HTML elements

by Loops (Curate)
on Nov 07, 2014 at 10:56 UTC ( #1106474=note )

in reply to Re^2: POD style regex for inline HTML elements
in thread POD style regex for inline HTML elements

Hi Aleena,

The extract_* functions are meant to operate on the start of a string, not from an arbitrary point. As mentioned in the Text::Balanced description, you may skip a prefix before the start of the balanced text, but by default this will only skip whitespace.
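To see that default in action, here is a minimal sketch. The $text value is my assumption, reconstructed from the output shown in this thread, so adjust it to your real input. With only whitespace skipped, the match has to begin at the 'A', so I'd expect the call to fail and leave the text in the remainder slot.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Assumed input, reconstructed from the output shown in this thread.
my $text = 'A line with B<bold>, I<italic>, and B<I<bold and italic>> text.';

# No third argument, so only leading whitespace may be skipped. The first
# non-space character is 'A', not '<', so there is nothing to extract here.
my ($extracted, $remainder, $prefix) = extract_bracketed($text, '<>');

print defined $extracted ? "matched: $extracted\n" : "no match\n";
```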

So if you were to change $text to:

my $text = ' <bold>, I<italic>, and B<I<bold and italic>> text.';

Your output would be:

$VAR1 = [ '<bold>', ', I<italic>, and B<I<bold and italic>> text.', ' ' ];

The return value is a triple: the bracketed text, the remaining string, and the prefix that was skipped before the bracketed text was found.

If you leave your $text input as it was in your example but change the function call to consider everything preceding a < as a prefix:

my @line = extract_bracketed($text, '<>', qr(.*?(?=<)));

You'll get:

$VAR1 = [ '<bold>', ', I<italic>, and B<I<bold and italic>> text.', 'A line with B' ];

Here the prefix is again everything before the <, but it now includes the B formatting code at the end, which you'd have to deal with appropriately.
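One way to deal with that trailing code letter, as a rough sketch rather than a finished parser: call extract_bracketed in a loop, feed the remainder back in each time, and peel the last capital letter off the prefix. The $text value is assumed (reconstructed from the output in this thread), and the names @codes and $rest and the single-capital-letter rule are my own choices for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Assumed input, reconstructed from the output shown in this thread.
my $text = 'A line with B<bold>, I<italic>, and B<I<bold and italic>> text.';

my @codes;          # collected [ code letter, bracketed text ] pairs
my $rest = $text;   # the part of the string still to be scanned

while (length $rest) {
    # Treat everything before the next '<' as the prefix to skip.
    my ($bracketed, $remainder, $prefix) =
        extract_bracketed($rest, '<>', qr/.*?(?=<)/);

    # On failure the extracted text is undef: no more bracketed text.
    last unless defined $bracketed;

    # Assumption: the formatting code is a single capital letter sitting
    # immediately before the opening '<', i.e. the last char of the prefix.
    my ($code) = $prefix =~ /([A-Z])\z/;
    push @codes, [ defined $code ? $code : '', $bracketed ];

    $rest = $remainder;   # continue after the text just extracted
}

print "$_->[0] => $_->[1]\n" for @codes;
```

Nested codes such as B<I<...>> come back as one outer chunk, so you would run the same loop again on the inside of the brackets if you need the inner codes too.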

