PerlMonks  

Re: In HTML , I Want to process only Data and Not tags

by lorn (Monk)
on Jul 25, 2006 at 20:25 UTC


in reply to In HTML , I Want to process only Data and Not tags

if (I understood what you said) {
you need to see this page:
http://www.stonehenge.com/merlyn/LinuxMag/col49.html
}

Lorn
-http://lornlab.org
-slackwarezine.com.br

Code tags added by GrandFather

Re^2: In HTML , I Want to process only Data and Not tags
by duckyd (Hermit) on Jul 25, 2006 at 22:05 UTC
    I think you mean to suggest something like:
    s/>[^>]+</.../
    But in general it's not a good idea to try to roll your own HTML or XML parsing solution when there are plenty of good ones out there.
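For what it's worth, here is a minimal sketch of the "use an existing parser" route, assuming the HTML::Parser module from CPAN is installed. The helper name `extract_text` and the sample markup are made up for illustration; neither comes from the original question.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return only the text content of an HTML string.
sub extract_text {
    my ($html) = @_;
    my @chunks;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'dtext' passes each run of text with entities already decoded.
        text_h      => [ sub { push @chunks, shift }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;    # flush anything still buffered

    # Text can arrive in several pieces, so join them back together.
    return join '', @chunks;
}

print extract_text('<p>Hello <b>world</b></p>'), "\n";
```

Only a text handler is registered, so tags are simply never seen by your code. Note that the contents of script and style elements are also reported as text, so real pages may need some extra filtering.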
Re: In HTML , I Want to process only Data and Not tags
by n00dles (Novice) on Jul 25, 2006 at 23:57 UTC

      I'm sure it does. But what does it work for? As shown it is a match that doesn't capture anything and will match a < at the start of a line, followed by anything at all for as much as it can manage, until it finds a >. For example, all the following match:

      '<>'
      '<tag>'
      "< line of quoted text in an email using '<' instead of the more usual '>'"
      '<tag>the stuff OP wanted to retrieve</tag>'

      Note that what is matched isn't even what OP wants to retrieve. OP was after element data: the bit between a start tag and an end tag.

      BTW, the regex matches the whole last sample line, not just the start tag as you might have expected: .* is greedy.
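To make the greediness point concrete, here is a small sketch. The regex under discussion is not quoted in this thread, so `/^<.*>/` is only an assumed stand-in that fits the description above. The non-greedy variant and the third pattern are added purely for comparison.

```perl
use strict;
use warnings;

my $line = '<tag>the stuff OP wanted to retrieve</tag>';

# Assumed stand-in for the regex being discussed: '<' at the start of the
# line, then as much as possible, up to the last '>'.
my ($greedy) = $line =~ /(^<.*>)/;

# Non-greedy variant: stops at the first '>', so only the start tag matches.
my ($lazy) = $line =~ /(^<.*?>)/;

# What OP was after: the data between a start tag and an end tag.
# Fine for this one-line sample, but not a substitute for a real parser.
my ($data) = $line =~ />([^<]*)</;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "data:   $data\n";
```

The greedy match swallows the whole line, the non-greedy one stops after `<tag>`, and only the third pattern captures the element data.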


      DWIM is Perl's answer to Gödel
