
Re: Using a Regex to extract tagged content

by gryphon (Abbot)
on Feb 27, 2004 at 16:27 UTC (#332293)

in reply to Using a Regex to extract tagged content

Greetings Anonymous,

Also, I know i could use some sort of XML module for this, but I'd rather do it with regex.

Why? Why go to the trouble of writing an incomplete regex that will eventually fail (on nested tags, reordered attributes, or unexpected whitespace, for instance) instead of just using a CPAN module? I would strongly recommend you read up on parsers like HTML::TokeParser. You will save yourself a lot of heartache. As a general rule, CPAN is always better than trying to do it yourself. Always.

I'm not sure exactly what you want to pull from your content, but here's a basic example to get you going:

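    use HTML::TokeParser;

    my ($type, $mesg);

    # $content is assumed to already hold the tagged text
    my $page = HTML::TokeParser->new(\$content);

    while (my $token = $page->get_tag('msg')) {
        $type = $token->[1]{dest};               # value of the dest attribute
        $mesg = $page->get_trimmed_text('/msg'); # text up to the closing </msg>
                                                 # ($token->[3] is only the raw tag source)
    }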
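If you want every message rather than just the last one the loop sees, something along these lines might do. It is only a sketch: the sample $content is made up (I haven't seen your real data), and the @messages array and its dest/text keys are names I picked for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input; the real data format was not shown in the question.
my $content = '<msg dest="alice">Hello there</msg>'
            . '<msg dest="bob">Lunch at noon?</msg>';

my @messages;
my $page = HTML::TokeParser->new(\$content);

# get_tag('msg') skips ahead to each opening <msg ...> tag in turn.
while (my $token = $page->get_tag('msg')) {
    push @messages, {
        dest => $token->[1]{dest},                # attribute hash lives in slot 1
        text => $page->get_trimmed_text('/msg'),  # text up to the closing </msg>
    };
}

# Print one "dest: text" line per message found.
printf "%s: %s\n", $_->{dest}, $_->{text} for @messages;
```

Storing each message as a hash reference keeps the destination and the text together, so nothing is overwritten on the next pass through the loop.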

code('Perl') || die;
