PerlMonks
same prob. different approach
by g00n (Hermit) on Mar 25, 2004 at 07:31 UTC
at the mercy of change
My problem is I want data. Not *pretty web pages*. Raw data in feed format that I can process. I'm pretty much getting the results you are looking for now, but without beating my head against parsing HTML and all its problems: namely, you are at the mercy of a web designer's whim to change the layout.
use rdf, rss or pda feeds
So I avoid HTML. I'm lazy. I look for the rss, rdf or pda html pages, point my spider at them and dump them in a directory for later parsing. Most news sites have rss feeds (though my local newspaper, The Age, supplies rss feeds only for a fee but produces a lite page for PDAs), so some parsing is still necessary.
Now suppose I want to parse a page (in Perl): why wouldn't I use Andy Lester's fine WWW::Mechanize? (WWW::Mechanize article). Questions, questions. I'm playing devil's advocate here; I'm not actually knocking the idea.
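For what it's worth, the "point my spider and dump them in a directory" step could look roughly like the sketch below. This is only an illustration, not the spider described above: it assumes HTTP::Tiny (a core module these days; LWP::Simple's mirror() would do the same job), and the sub names feed_filename and dump_feeds and the feeds directory are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;                    # assumption: any HTTP client with a mirror() would do
use File::Path qw(make_path);
use File::Spec;

# Turn a feed URL into a flat, filesystem-safe file name (hypothetical helper).
sub feed_filename {
    my ($url) = @_;
    (my $name = $url) =~ s{^\w+://}{};    # drop the scheme
    $name =~ s{[^A-Za-z0-9._-]+}{_}g;     # squash everything else to "_"
    return $name;
}

# Mirror each feed into $dir for later parsing; returns the files that are current.
sub dump_feeds {
    my ($dir, @urls) = @_;
    make_path($dir);
    my $http = HTTP::Tiny->new(timeout => 30);
    my @saved;
    for my $url (@urls) {
        my $file = File::Spec->catfile($dir, feed_filename($url));
        # mirror() sends If-Modified-Since when the file already exists,
        # so unchanged feeds are not downloaded again
        my $res = $http->mirror($url, $file);
        if ($res->{success}) { push @saved, $file }
        else { warn "skipped $url: $res->{status} $res->{reason}\n" }
    }
    return @saved;
}

# Feed URLs come from the command line; nothing is fetched without them.
dump_feeds('feeds', @ARGV) if @ARGV;
```

Run it with the feed URLs as arguments and the raw files land in feeds/, ready for whatever parser you like. The fetching and the parsing stay separate, so a site changing its layout only breaks the parsing half.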
Now you may say, "goon, you're an idiot, be quiet." But ...
esr ~ <a href="http://catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ar01s02.html"> The Cathedral and the Bazaar v3.0</a>.