PerlMonks  

Re: Using a Regex to extract tagged content

by gryphon (Abbot)
on Feb 27, 2004 at 16:27 UTC ( [id://332293]=note )


in reply to Using a Regex to extract tagged content

Greetings Anonymous,

Also, I know i could use some sort of XML module for this, but I'd rather do it with regex.

Why? Why go to the trouble of writing an incomplete regex that will eventually fail instead of just using a CPAN module? I would strongly recommend you read up on parsers like HTML::TokeParser. You will save yourself a lot of heartache. As a general rule, a well-tested CPAN module beats trying to do it yourself.

I'm not sure exactly what you want to pull from your content, but here's a basic example to get you going:

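use HTML::TokeParser;

# $content is assumed to already hold the tagged text
my ($type, $mesg);
my $page = HTML::TokeParser->new(\$content);

while (my $token = $page->get_tag('msg')) {
    $type = $token->[1]{dest};        # value of the dest attribute
    # $token->[3] is only the raw text of the opening tag itself,
    # so read the content up to the closing </msg> instead
    $mesg = $page->get_text('/msg');
}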
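The loop above keeps only the last message it sees, since $type and $mesg are overwritten on every pass. If you want every message, here is a minimal sketch that collects them into an array. The sample input and the @messages name are invented for illustration; only the msg tag and its dest attribute come from the snippet above. It uses get_trimmed_text to read the text up to the closing tag.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical input: the values below are made up for this example.
my $content = <<'END';
<msg dest="alice">Hello there</msg>
<msg dest="bob">Lunch at noon?</msg>
END

my @messages;
my $page = HTML::TokeParser->new(\$content)
    or die "Could not create parser";

# get_tag('msg') skips ahead to the next opening <msg> tag and returns
# [ $tag, \%attr, \@attrseq, $raw_tag_text ], or undef at end of input.
while (my $token = $page->get_tag('msg')) {
    push @messages, {
        type => $token->[1]{dest},                 # dest attribute
        mesg => $page->get_trimmed_text('/msg'),   # text before </msg>
    };
}

printf "%s: %s\n", $_->{type}, $_->{mesg} for @messages;
```

Each element of @messages is a hash reference, so you can filter or sort the messages afterwards without parsing the content again.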

gryphon
code('Perl') || die;
