PerlMonks
Gosh, I wish I even knew the difference between those HTML:: modules and how to put them to work! Given the examples in this thread, soon I'll have more than a hammer to do my scraping. :-)
In the meantime: when I go to the web page from the link in my IE browser, do a Ctrl-A and Ctrl-C, and then paste the text into a Notepad window, this particular output is quite comprehensible to my HTML-untrained eye (vs. the HTML stuff), e.g.
With my regex sledgehammer it would be straightforward to process this data. Oftentimes, when I look at the "pure text" version of a web page there aren't nearly as many nice hooks for sorting things out. But this is THIS case, and my question is: might there be a tool which emulates this action of select/copy/paste of a web page to automate the production of such text for follow-on regex processing?
In reply to Re: Scraping HTML: orthodoxy and reality by ff
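For what it's worth, here is a rough sketch of the select/copy/paste idea: walk the HTML, throw away the tags, and keep only the text a reader would see. It is written in Python rather than Perl purely so that it needs nothing beyond the standard library. The names `TextDump` and `html_to_text`, and the sample markup, are made up for illustration. It assumes that breaking lines at block-level tags and putting a tab before each table cell is close enough to what the browser puts on the clipboard, so it will not match IE's output exactly.

```python
from html.parser import HTMLParser


class TextDump(HTMLParser):
    """Collect visible text, roughly like select-all/copy/paste (hypothetical helper)."""

    # Assumption: these tags start or end a line in the pasted text.
    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "table", "ul", "ol", "pre",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    # Content of these tags is never shown on the page.
    SKIP_TAGS = {"script", "style", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")
        elif tag in ("td", "th"):
            self.chunks.append("\t")  # one tab before each table cell

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the page text as tidy lines, ready for follow-on regex processing."""
    parser = TextDump()
    parser.feed(html)
    parser.close()
    lines = []
    for raw in "".join(parser.chunks).split("\n"):
        # Collapse runs of whitespace inside each cell, keep tabs between cells.
        cells = [" ".join(cell.split()) for cell in raw.split("\t")]
        line = "\t".join(cells).strip()
        if line:
            lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = ("<h1>Scores</h1><table><tr><td>Alice</td><td>10</td></tr>"
              "<tr><td>Bob</td><td>7</td></tr></table>")
    print(html_to_text(sample))
```

The result is plain lines with tab-separated fields, which is the kind of input a line-oriented regex handles well. Pages that rely on scripts or styling to arrange their text would likely need a real browser or a text-mode browser dump instead.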