Re^3: Working with source of returned web page



by Popcorn Dave (Abbot)

on Jun 10, 2008 at 22:48 UTC ( [id://691353]=note: print w/replies, xml )

in reply to Re^2: Working with source of returned web page
in thread Working with source of returned web page

It's been a while since I used that module, but if I recall correctly, it parses everything into tokens, and anything not recognized as an HTML tag should come back as a text token.

Take a look at the node "HTML::TokeParser help - parsing headlines", where you'll find a quick program I wrote to dump an HTML page as tokenized output. Run that on your page and I think you'll see that you don't need the regex per se; you just need to check the text tokens to find what you're after.
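
For anyone who wants to try the idea without chasing the link, here is a minimal sketch of that kind of token dump. It is not the program from the linked node: the sample HTML, the tokenize helper, and the "Price" pattern are all made up for illustration, and it assumes HTML::TokeParser (from the HTML-Parser distribution) is installed.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample page; in practice this would be the source you fetched.
my $html = '<html><body><h1>Headline</h1><p>Price: <b>42</b></p></body></html>';

# Hypothetical helper: return one [type, value] pair per token.
sub tokenize {
    my ($source) = @_;
    # Passing a reference to a scalar makes the parser read from the string.
    my $parser = HTML::TokeParser->new(\$source)
        or die "Cannot set up parser";
    my @tokens;
    while (my $token = $parser->get_token) {
        # Element 0 is the token type: S (start tag), E (end tag), T (text),
        # C (comment), D (declaration), PI (processing instruction).
        # For S and E tokens element 1 is the tag name; for T it is the text.
        push @tokens, [ $token->[0], $token->[1] ];
    }
    return @tokens;
}

my @tokens = tokenize($html);

# Dump every token so you can see how the page is broken up.
printf "%-2s %s\n", @$_ for @tokens;

# Keep only the text tokens, then look for the one you are after.
my @text = map { $_->[1] } grep { $_->[0] eq 'T' } @tokens;
my @hits = grep { /Price/ } @text;
print "Matched text token: $_\n" for @hits;
```

The point of the dump is to see which tag each piece of text sits next to; once you know that, you can walk the tokens and pick out the text token that follows the tag you care about instead of running one big regex over the raw source.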

Good luck!

Update: Changed link from scratchpad to node as per suggestion by ww

Revolution. Today, 3 O'Clock. Meet behind the monkey bars.

I would love to change the world, but they won't give me the source code
