http://qs321.pair.com?node_id=665585

kansaschuck has asked for the wisdom of the Perl Monks concerning the following question:

I've got Perl 5.10.0 on my Toshiba laptop, which has a broadband connection to the net. I'd like to run a scheduled Perl task that grabs a simple .txt file from a website. A gentle push, examples, suggested code, or hints would be welcome. I'm new here. Hi everyone.
  • Comment on Perl on Windows XP needs to grab internet txt file.

Re: Perl on Windows XP needs to grab internet txt file.
by grinder (Bishop) on Feb 01, 2008 at 17:06 UTC

    LWP::Simple is about the easiest way, although you forgo fine-grained error checking. The basic recipe goes like this:

    use strict;
    use warnings;
    use LWP::Simple;

    my $local = time . ".txt";
    my $url   = "http://www.example.com/";

    my $page = get($url) or die "failed!\n";
    open my $out, '>', $local or die "Cannot open $local for output: $!\n";
    print $out $page;
    close $out;

    You can then wrap that up in a cmd file:

    @echo off
    c:
    cd \path\to\program\dir
    perl get-page.pl

    And run that as a scheduled task. Make sure the scheduled task service is enabled and running. You might want to think of a better name for your local file.
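
    On the point about a better local filename, here is a minimal sketch that builds a date-stamped name instead of a raw epoch number. The helper name dated_filename and the "page" prefix are made up for illustration; only POSIX::strftime from the core distribution is used.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not part of LWP): build a name such as
# "page-2008-02-01.txt" from a prefix and an epoch time.
# The date is taken in the machine's local time zone.
sub dated_filename {
    my ($prefix, $epoch) = @_;
    $epoch = time unless defined $epoch;
    return sprintf '%s-%s.txt', $prefix, strftime('%Y-%m-%d', localtime($epoch));
}

# Prints today's name, e.g. page-YYYY-MM-DD.txt
print dated_filename('page'), "\n";
```

    With something like this, the recipe above could use my $local = dated_filename('page'); so that each day's scheduled run writes a file that sorts by date.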

    • another intruder with the mooring in the heart of the Perl

Re: Perl on Windows XP needs to grab internet txt file.
by dwm042 (Priest) on Feb 01, 2008 at 15:53 UTC
    In this case, the O'Reilly book Spidering Hacks is a terrific resource for mechanically working with the web, and the module WWW::Mechanize is your friend.

Re: Perl on Windows XP needs to grab internet txt file.
by rgiskard (Hermit) on Feb 01, 2008 at 17:07 UTC
    If you don't want to buy or borrow a book, you can browse around the Tutorials section; there's a lot of stuff there for beginners.

    At a glance, there's a tutorial on LWP.

Re: Perl on Windows XP needs to grab internet txt file.
by DigitalKitty (Parson) on Feb 02, 2008 at 00:37 UTC
    Ahoy kansaschuck!

    LWP::Simple is an extremely useful module and I highly recommend using it for your HTTP-related needs. To retrieve a file from a remote server, the following code would work well:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use LWP::Simple;

    my $url  = '';
    my $file = '';
    my $resp = '';

    print 'Please enter a URL: ';
    chomp( $url = <STDIN> );

    print 'Please enter a filename in which to store the data: ';
    chomp( $file = <STDIN> );

    # We don't want to overwrite a file that already exists so using the '-e'
    # essentially asks: Does a file of this name already exist? If so, the
    # message is displayed and the program exits.
    if ( -e $file ) {
        die "Sorry. That filename already exists.\n";
    }

    # The 'getstore' function returns an http response code e.g. 200, 404, etc.
    # The syntax says: Store the data obtained from '$url' in '$file'.
    $resp = getstore( $url, $file );
    exit;
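
    The code above stores the response code from getstore but never looks at it. As a rough sketch of how a script might act on that code, the helper below (fetch_to_file is a made-up name, not part of LWP) uses is_success and status_message, which LWP::Simple re-exports from HTTP::Status:

```perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success status_message);

# Hypothetical helper: download $url into $file and return a
# (success flag, message) pair based on the HTTP response code.
sub fetch_to_file {
    my ($url, $file) = @_;
    my $code = getstore($url, $file);
    return (1, "Saved $url to $file") if is_success($code);
    my $text = status_message($code) || 'unknown status';
    return (0, "Download of $url failed: $code $text");
}

# Example use (commented out so nothing touches the network by accident):
# my ($ok, $msg) = fetch_to_file('http://www.example.com/', 'example.txt');
# die "$msg\n" unless $ok;
# print "$msg\n";
```

    For a scheduled task this matters more than for an interactive script, since nobody is watching the console; dying with the status text at least leaves something to find in the task's log or exit code.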


    Hope this helps,

    ~Katie
Re: Perl on Windows XP needs to grab internet txt file.
by Jim (Curate) on Feb 02, 2008 at 23:17 UTC
    Consider using LWP::Simple::mirror for your specific task. There's a good example of its use in the article about LWP::Simple on the 22nd day of the 2003 Perl Advent Calendar.

    Here's my own example:

    #!C:/Perl/bin/perl.exe
    use strict;
    use warnings;
    use LWP::Simple qw( mirror is_error RC_NOT_MODIFIED );

    my $url  = 'http://rabbit.eng.miami.edu/dics/pocket.txt';
    my $file = $url;
    $file =~ s{.*/}{};

    my $status = mirror($url, $file);

    if (is_error($status)) {
        die "Can't get text file at URL $url\n";
    }

    if ($status == RC_NOT_MODIFIED) {
        warn "File $file not modified\n";
    }

    # Do something with freshened file named pocket.txt...

    exit 0;
    See sections 14.13 (Content-Length) and 14.25 (If-Modified-Since) of RFC 2616, the HTTP/1.1 specification, for the gory details of what's going on under the hood.
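
    To make the If-Modified-Since part a little more concrete: when the local file already exists, a mirror-style request carries that file's modification time as an HTTP date, and the server can answer 304 Not Modified instead of resending the body. The sketch below only shows how such a header value could be built by hand; http_date and if_modified_since are illustrative names, and LWP does this for you internally.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format an epoch time as an HTTP date (always GMT,
# always English day and month names, regardless of locale).
sub http_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900, $hour, $min, $sec;
}

# Hypothetical helper: the header line a conditional GET would carry for
# $file, or undef if the file does not exist yet.
sub if_modified_since {
    my ($file) = @_;
    my $mtime = (stat $file)[9];
    return defined $mtime ? 'If-Modified-Since: ' . http_date($mtime) : undef;
}

print http_date(0), "\n";    # Thu, 01 Jan 1970 00:00:00 GMT
```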

    Cheers!

    Jim

      Thanks all! I went with the LWP Perl modules to access remote files, and that worked very well. Thanks, kc
Re: Perl on Windows XP needs to grab internet txt file.
by Starky (Chaplain) on Feb 05, 2008 at 15:12 UTC
    The above comments are sound advice. However, if your task is as simple as you describe it, the simplest solution may be to just set up a cron job to call wget.

    If you use Windows, I believe (but am not 100% sure) that Start -> Control Panel -> Scheduled Tasks is the cron equivalent and you can find wget for Windows here.