

in reply to Using a Regex to extract tagged content

Greetings Anonymous,

"Also, I know i could use some sort of XML module for this, but I'd rather do it with regex."

Why? Why go to the trouble of writing an incomplete regex that will eventually fail instead of just using a CPAN module? I would strongly recommend you read up on parsers like HTML::TokeParser. You will save yourself a lot of heartache. As a general rule, using a well-tested CPAN module is always better than trying to do it yourself. Always.

I'm not sure exactly what you want to pull from your content, but here's a basic example to get you going:

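use HTML::TokeParser;

# $content is assumed to already hold the tagged text
my ($type, $mesg);
my $page = HTML::TokeParser->new(\$content);
while (my $token = $page->get_tag('msg')) {
    # get_tag returns [$tag, $attr, $attrseq, $text] for a start tag
    $type = $token->[1]{dest};                # value of the dest attribute
    # $token->[3] is only the raw tag text, so read the body up to </msg>
    $mesg = $page->get_trimmed_text('/msg');
}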
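If you want something you can run as-is, here is a minimal sketch along the same lines. The sample $content and the extract_messages helper are made up for illustration, since the real input isn't shown in the thread. It also assumes the part you care about is the text between <msg ...> and </msg>, which get_trimmed_text('/msg') reads, and it collects every message instead of keeping only the last one.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input; the real format of $content is not shown in the thread.
my $content = <<'END';
<msg dest="alice">Hello, Alice</msg>
<msg dest='bob'>
  Lunch at noon?
</msg>
END

# Hypothetical helper: return a list of [dest, text] pairs, one per <msg> element.
sub extract_messages {
    my ($text) = @_;
    my $page = HTML::TokeParser->new(\$text)
        or die "Cannot create parser";
    my @messages;
    while (my $token = $page->get_tag('msg')) {
        my $dest = $token->[1]{dest};                # element 1 is the attribute hash
        my $body = $page->get_trimmed_text('/msg');  # text up to </msg>, whitespace trimmed
        push @messages, [$dest, $body];
    }
    return @messages;
}

# Print each message as "dest: text".
printf "%s: %s\n", @$_ for extract_messages($content);
```

Note that the quoting style of the attribute (single or double quotes) and the line breaks inside the second message don't matter to the parser, which is exactly the sort of thing a hand-rolled regex tends to trip over.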

gryphon
code('Perl') || die;