PerlMonks  

How would I find a piece of html from a sourcefile.html?

by Corry (Initiate)
on May 26, 2000 at 14:01 UTC ( [id://14941]=perlquestion )

Corry has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I want to grep a piece of code out of the middle of an HTML source file. The snippet I want sits between comment tags. Can anyone help me get started? (I ordered the book "regular expressions" from O'Reilly. Until it arrives, I hope you guys can help.) Thanks in advance, Corry.

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: How would I find a piece of html from a sourcefile.html?
by dsb (Chaplain) on Jan 24, 2001 at 02:54 UTC
    This regular expression will grab the first tag in $data, brackets and all:
    print $1, "\n" if $data =~ m/(<[^>]+>)/; # print the tag, if one matched
    This regex will leave out the brackets and print only the string inside them:
    print $1, "\n" if $data =~ m/<([^>]+)>/; # print the tag's contents, if one matched
    Use modifiers (such as /g) or loops as you need to. -kel
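    The "modifiers or loops" hint above could look something like the following. This is only a sketch: the sample string in $data is made up for illustration, and the [^>]+ pattern will trip over attribute values that contain a literal ">".

```perl
use strict;
use warnings;

# Hypothetical sample input; in practice $data would hold the file contents.
my $data = '<p>Hello <b>world</b></p>';

# With /g in list context, a match returns every capture from every match,
# not just the first one.
my @tags  = $data =~ m/(<[^>]+>)/g;   # each tag, brackets included
my @names = $data =~ m/<([^>]+)>/g;   # the same tags, brackets stripped

print "$_\n" for @tags;
```

    The same pattern in scalar context inside a while loop (while ($data =~ m/.../g) { ... }) walks the matches one at a time instead of collecting them into a list.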
Re: How would I find a piece of html from a sourcefile.html?
by athomason (Curate) on May 26, 2000 at 14:27 UTC
    Comments make HTML extraction even more difficult than it usually is. However, if you're dealing with fairly standard HTML you could use
    $commented = $1 if $page =~ /<!--\s*(.*?)\s*-->/s; # non-greedy; /s lets . span newlines
    This will grab the string inside the comment; add appropriate qualifiers to the regexp as necessary (or use another on $commented) if you only want to pick certain stuff out.
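    If the wanted snippet sits between two marker comments rather than inside a single one, the same idea extends as sketched below. The marker texts "begin snippet" and "end snippet" and the sample page are assumptions for illustration; substitute whatever comments actually surround the snippet in your file.

```perl
use strict;
use warnings;

# Hypothetical page; the marker names "begin snippet" and "end snippet"
# are assumptions, not something taken from the original question.
my $page = <<'HTML';
<html><body>
<!-- begin snippet -->
<p>wanted text</p>
<!-- end snippet -->
<p>other stuff</p>
</body></html>
HTML

# /s lets . match newlines; .*? stops at the first closing marker.
my ($snippet) = $page =~
    /<!--\s*begin snippet\s*-->\s*(.*?)\s*<!--\s*end snippet\s*-->/s;

print defined $snippet ? "$snippet\n" : "no snippet found\n";
```

    Matching in list context leaves $snippet undefined when the markers are missing, so a stale $1 from an earlier match is never picked up by accident.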
