Re: web page update notifier
by b10m (Vicar) on Jun 17, 2004 at 21:04 UTC
This snippet downloads the whole file every time you (or cron) run the script. Wouldn't it be nicer to send just a HEAD request, check the "Last-Modified" header, and do some local testing on that?
$ HEAD http://www.server.tld/page.htm | grep "Last-Modified"
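The same check could be done from Perl. This is only a minimal sketch, not the original snippet: it assumes HTTP::Tiny (in core since Perl 5.14) as the HTTP client, and the state-file name last_modified.txt and the sub names are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# True when the header value differs from the one seen last time.
# A missing value on either side (undef) counts as "changed", because
# nothing can be concluded from it. The very first run therefore
# reports a change.
sub header_changed {
    my ($previous, $current) = @_;
    return 1 unless defined $previous && defined $current;
    return $previous ne $current;
}

# Ask for the headers only and pull out Last-Modified (may be undef).
sub fetch_last_modified {
    my ($url) = @_;
    my $res = HTTP::Tiny->new(timeout => 10)->head($url);
    return $res->{success} ? $res->{headers}{'last-modified'} : undef;
}

# Only do network and file work when a URL is given on the command line.
if (defined(my $url = shift @ARGV)) {
    my $state = 'last_modified.txt';    # hypothetical state file

    my $previous;
    if (open my $in, '<', $state) {
        $previous = <$in>;
        chomp $previous if defined $previous;
        close $in;
    }

    my $current = fetch_last_modified($url);
    if (header_changed($previous, $current)) {
        print "$url may have changed\n";
        if (defined $current) {
            open my $out, '>', $state or die "cannot write $state: $!";
            print {$out} "$current\n";
            close $out;
        }
    }
}
```

From cron this would be run with the page URL as its only argument. Servers that send no Last-Modified header at all will be reported as changed on every run, so some fallback is needed for those.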
--
b10m
All code is usually tested, but rarely trusted.
Re: web page update notifier
by rob_au (Abbot) on Jun 18, 2004 at 07:13 UTC
I've previously posted something similar on this site in the node Scripted Actions upon Page Changes, which may also be of interest (although it is somewhat dated now). That code differs in that it uses the Last-Modified header or, where this is unavailable, a message digest of the page to detect page changes.
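The header-or-digest fallback might look roughly like this. It is a sketch of the idea only, not the code from that node: the sub name page_fingerprint is made up, and the response is assumed to be an HTTP::Tiny-style hash with lowercased header names and the raw body bytes in `content`.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build a change fingerprint for a page: prefer the Last-Modified
# header, fall back to an MD5 digest of the body when it is missing.
sub page_fingerprint {
    my ($res) = @_;
    my $stamp = $res->{headers}{'last-modified'};
    return "last-modified:$stamp" if defined $stamp && length $stamp;
    return 'md5:' . md5_hex($res->{content} // '');
}

# Two hand-made responses stand in for real downloads.
my $with_header = {
    headers => { 'last-modified' => 'Thu, 17 Jun 2004 21:04:00 GMT' },
    content => 'hello',
};
my $without_header = { headers => {}, content => 'hello' };

print page_fingerprint($with_header),    "\n";
print page_fingerprint($without_header), "\n";
```

Storing the fingerprint from the previous run and comparing it as a string is then enough to spot a change. Note that the digest branch needs the body, so it requires a full GET rather than a HEAD.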
perl -le "print unpack'N', pack'B32', '00000000000000000000001011100100'"
Re: web page update notifier
by ihb (Deacon) on Jun 18, 2004 at 01:25 UTC
In short, my point is that when sharing it with other monks I'd be happier to see a portable snippet since it doesn't require much work to make it that. Of course, it's better to share a non-portable snippet than not share at all; that's why I said "happier" and not "happy".
It's not worth the trouble for you when you use it, but since this post isn't targeted to you I just figured it would be nice if you patched it so that more could benefit from it. Just as you'd do with any CPAN module you publish.
ihb
Re: web page update notifier
by zby (Vicar) on Jun 18, 2004 at 08:36 UTC
In my spare time I am developing a more complicated notifier with a web interface. The additional feature is that it lets you add regexps to ignore some changes (useful for pages that, for example, show the current date somewhere). I plan for it to evolve into something like what RSS does, by extracting what is new on the page (with a kind of HTML diff). You can read some documentation for it, download it, or try it on my home server at Active Bookmarks Manual.
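To illustrate the ignore-regexps idea (a minimal sketch only, not taken from the notifier itself; the sub name strip_ignored and the date pattern are made up for the example): strip the volatile parts from both versions before comparing them.

```perl
use strict;
use warnings;

# Remove every match of the "ignore" patterns, so that volatile parts
# of a page (a clock, a visitor counter) do not register as changes.
sub strip_ignored {
    my ($text, @patterns) = @_;
    $text =~ s/$_//g for @patterns;
    return $text;
}

# Hypothetical pattern: an ISO-style date such as 2004-06-18.
my @ignore = ( qr/\d{4}-\d{2}-\d{2}/ );

my $old = "Today is 2004-06-17\nNews: nothing yet\n";
my $new = "Today is 2004-06-18\nNews: nothing yet\n";

my $same    = strip_ignored($old, @ignore) eq strip_ignored($new, @ignore);
my $verdict = $same ? 'unchanged' : 'changed';
print "$verdict\n";    # only the date differs, so this prints "unchanged"
```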
I wanted to use it as a replacement for Personal Nodelet, so it has a special (undocumented) feature: links to Perl Monks are internally converted to links to the appropriate The Pen pages.
By the way, most current web browsers can notify you about changes to pages in your bookmarks.
|
"By the way most current web browsers can notify you about changes to pages in your bookmarks."
I don't just want to know that it changed, I want to know exactly which lines were added and removed. There are numerous scripts that do something like this, but creating a new one is MUCH easier than reading manuals of other scripts, because they're all bloated with features I don't need right now.
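For what it's worth, a bare-bones "which lines were added and removed" report fits in a few lines of core Perl. This is only a sketch under a big simplification: it is a set comparison, not a real diff, so it ignores line order and repeated lines. The sub name line_changes is made up for the example.

```perl
use strict;
use warnings;

# Report lines present in only one version. This is a plain set
# comparison: unlike diff(1) it ignores line order and repeated lines.
sub line_changes {
    my ($old_text, $new_text) = @_;
    my @old = split /\n/, $old_text;
    my @new = split /\n/, $new_text;
    my %in_old = map { ($_ => 1) } @old;
    my %in_new = map { ($_ => 1) } @new;
    my @removed = grep { !$in_new{$_} } @old;
    my @added   = grep { !$in_old{$_} } @new;
    return (\@added, \@removed);
}

my ($added, $removed) = line_changes(
    "alpha\nbeta\ngamma\n",
    "alpha\ngamma\ndelta\n",
);
print "+ $_\n" for @$added;      # prints "+ delta"
print "- $_\n" for @$removed;    # prints "- beta"
```

For anything where order or context matters, piping both versions through diff itself remains the better tool.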
Re: web page update notifier
by danielcid (Scribe) on Jul 09, 2004 at 15:31 UTC
You could use an MD5 hash to check whether the file has been modified. It is much more accurate and safe than only using a diff. In addition, using the MD5 hash makes the storage much smaller, since only the digest needs to be kept rather than a copy of the file.
* Someone said to use "HEAD" to check the last modified date. This value is not safe/trustworthy.
[]'s
-DBC
|
"It is much more accurate and safe than only using a diff."
Accuracy is irrelevant for text documents. Either a line is the same, or it is not. Besides that, I'm especially interested in *which* lines are different, and how they changed. diff tells me exactly that.
"the last modified date. This value is not safe/trustworthy."
It has proven to be worthy of my trust.