PerlMonks  

Re: Yet Another Scraping Question

by izut (Chaplain)
on Apr 18, 2006 at 09:20 UTC ( [id://544020] )


in reply to Yet Another Scraping Question

Have you inspected the contents of the loaded page? You could compute an MD5 checksum of its contents; if it matches the checksum of the previously viewed page, you're done.
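A minimal sketch of that idea, assuming each page is already available as a string. The helper name `same_as_last_page` and the sample `@pages` list are hypothetical and stand in for whatever the scraper actually fetches; `Digest::MD5` and `Encode` ship with Perl.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical helper: returns 1 when $html has the same MD5 digest as the
# previously seen page, 0 otherwise. $last_ref is a reference to a scalar
# that remembers the previous digest between calls.
sub same_as_last_page {
    my ($html, $last_ref) = @_;
    # md5_hex croaks on wide characters, so encode to bytes first.
    my $digest = md5_hex(encode_utf8($html));
    my $same   = defined $$last_ref && $$last_ref eq $digest;
    $$last_ref = $digest;
    return $same ? 1 : 0;
}

# Stand-in for pages fetched one after another (assumed data).
my @pages = ('<p>page 1</p>', '<p>page 2</p>', '<p>page 2</p>');

my $last;
my $count = 0;
for my $html (@pages) {
    last if same_as_last_page($html, \$last);   # repeat seen: stop scraping
    $count++;
}
print "pages processed before a repeat: $count\n";
```

The comparison only catches a page that is byte-for-byte identical to the previous one, so any content that changes on every request will defeat it.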

Igor 'izut' Sutton
your code, your rules.

Re^2: Yet Another Scraping Question
by Cody Pendant (Prior) on Apr 19, 2006 at 05:15 UTC
    Nice idea, but it would fail if the two copies of the page carried different timestamps, wouldn't it?

    I could, however, compute the checksum on just the HTML table that forms the bulk of the page, rather than on the whole page.
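    One way this might look, as a rough sketch: pull out the first table with a regex and digest only that. The helper name `table_digest` and the two sample pages are assumptions for illustration, and the non-greedy match will cut nested tables short, so a real parser such as HTML::TreeBuilder would be more robust.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical helper: MD5 digest of the first <table>...</table> in $html,
# or undef when the page contains no table.
sub table_digest {
    my ($html) = @_;
    # /s lets . match newlines, /i ignores tag case.
    my ($table) = $html =~ m{(<table\b.*?</table>)}is;
    return unless defined $table;
    return md5_hex(encode_utf8($table));
}

# Two assumed pages that differ only in a timestamp outside the table.
my $page_a = '<html><body><p>Generated 09:20</p>'
           . '<table><tr><td>row</td></tr></table></body></html>';
my $page_b = '<html><body><p>Generated 09:21</p>'
           . '<table><tr><td>row</td></tr></table></body></html>';

print table_digest($page_a) eq table_digest($page_b)
    ? "same table\n"
    : "different table\n";
```

    A timestamp inside the table itself would still change the digest, so this only helps when the varying content sits outside the table.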



    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print

      That's right. I think a checksum of just the HTML table would be enough.

      Igor 'izut' Sutton
      your code, your rules.
