PerlMonks  

how to open a data file via http

by mwhiting (Beadle)
on Feb 26, 2007 at 20:29 UTC

mwhiting has asked for the wisdom of the Perl Monks concerning the following question:

Hi - I have a program which currently opens a local file, reads it in line by line, and then closes the file (a standard kind of thing to do). I need to know how to open a file over http to do the same thing. Here's my barebones code:

    open(DATAFILE, '../libsearch/SAMPLE.TXT');
    while (<DATAFILE>) {
        print $_;
    }
    close(DATAFILE);

Instead of the relative filepath, I tried this line:

    open(DATAFILE, 'http://www.jaywil.com/libsearch/SAMPLE.TXT');

Didn't work. How can I open a file over http to read its successive text lines, as if it were local?

Thanks so much!

Replies are listed 'Best First'.
Re: how to open a data file via http
by stonecolddevin (Parson) on Feb 26, 2007 at 20:36 UTC

    Use LWP::Simple to get the file, using getstore, then open it and read it thusly.

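    Something like this (untested):

        #!/usr/bin/perl -w
        use strict;
        use LWP::Simple;

        my $yourfilename = 'SAMPLE.TXT';    # placeholder: where to save the local copy

        # getstore() returns the HTTP status code, not the content
        my $status = getstore("http://www.jaywil.com/libsearch/SAMPLE.TXT", $yourfilename);
        die "Download failed with status $status" unless is_success($status);

        open(DATAFILE, $yourfilename) or die "Cannot open $yourfilename: $!";
        # your file manipulation code here

    where $yourfilename is the path where you want the downloaded file saved.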

    Hope this helps!

    meh.
Re: how to open a data file via http
by EvanK (Chaplain) on Feb 26, 2007 at 20:57 UTC
    Well, the simplest way would be to use LWP::Simple (as the name implies), or you could use URI::Fetch if you want a bit more control (this actually uses LWP::UserAgent under the hood).

        # get content of the page, or die if get() fails
        use LWP::Simple;
        $content = get('http://www.example.com/some/document.txt');
        if (defined $content) {
            print $content;
        } else {
            die 'get() failed';
        }

        # get content of page, sending a custom useragent (browser name)
        # note: URI::Fetch expects an LWP::UserAgent object here, not a string
        use URI::Fetch;
        use LWP::UserAgent;
        $object = URI::Fetch->fetch(
            'http://example.com/some/document.txt',
            UserAgent => LWP::UserAgent->new(agent => 'Perl script')
        ) or die URI::Fetch->errstr;

    __________
    Systems development is like banging your head against a wall...
    It's usually very painful, but if you're persistent, you'll get through it.

Re: how to open a data file via http
by Anonymous Monk on Feb 26, 2007 at 22:16 UTC
    Didn't work.
    Well you can't just make stuff up and expect it to work :)

        use IO::All;                        # Let the madness begin...
        $html < io->http("www.google.com"); # Grab a web page

Re: how to open a data file via http
by mreece (Friar) on Feb 27, 2007 at 03:19 UTC
    another option, open a scalar as a filehandle:

        use LWP::Simple;
        my $data = get('http://www.jaywil.com/libsearch/SAMPLE.TXT');
        open DATAFILE, '<', \$data;
        while (<DATAFILE>) {
            print $_;
        }

Re: how to open a data file via http
by hangon (Deacon) on Feb 27, 2007 at 05:26 UTC

    HTTP does not directly support what you're trying to do. A plain GET returns the entire file named in the request; it has no notion of reading a file line by line. The easiest solution would be to slurp a copy of the file into a variable. Here are a couple of ways using LWP::Simple.

        use LWP::Simple;

        # get file into variable
        my $data = get('http://www.jaywil.com/libsearch/SAMPLE.TXT');

        # print all at once
        print $data;

        # or break up and print line by line
        # update: to keep blank lines at eof see comment by ikegami below
        @lines = split(/\n/, $data);
        for (@lines){
            print "$_\n";
        }

    If for some reason you actually need to get the file one line at a time over HTTP, you would need some help on the server side. If someone hasn't already written a module for this, you could write a CGI script that keeps state information and handles a few operations such as *open*, *read_line* and *close*. Basically you would be designing your own mini protocol to run over HTTP.
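
If the aim is only to handle lines as they arrive, rather than wait for the whole download, a client-side approach may be enough. The sketch below is untested against a real server and rests on some assumptions: make_line_feeder and stream_lines are hypothetical helper names, and the streaming relies on the ':content_cb' option of LWP::UserAgent's get(), which hands the body to a callback in arbitrary chunks. The feeder buffers those chunks and passes on one complete line at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that accepts arbitrary chunks of
# text and calls $on_line once for every complete line seen so far.
sub make_line_feeder {
    my ($on_line) = @_;
    my $buffer = '';
    return sub {
        my ($chunk) = @_;
        if (defined $chunk) {
            $buffer .= $chunk;
            # Hand over every complete line, newline included.
            while ($buffer =~ s/\A(.*?\n)//) {
                my $line = $1;
                $on_line->($line);
            }
        }
        elsif (length $buffer) {
            # undef signals end of data: flush a last line that has no newline.
            $on_line->($buffer);
            $buffer = '';
        }
        return;
    };
}

# Hypothetical helper: streams $url and feeds each line to $on_line.
sub stream_lines {
    my ($url, $on_line) = @_;
    require LWP::UserAgent;    # loaded lazily; only needed for real requests
    my $feed = make_line_feeder($on_line);
    my $response = LWP::UserAgent->new->get(
        $url,
        ':content_cb' => sub { $feed->($_[0]) },    # first argument is the chunk
    );
    $feed->(undef);            # flush whatever is left in the buffer
    return $response->is_success;
}

# Offline demonstration: chunks that split lines at awkward places.
my @got;
my $feed = make_line_feeder(sub { push @got, $_[0] });
$feed->($_) for "first li", "ne\nsecond\nthi", "rd";
$feed->(undef);
print @got, "\n";
```

Against a live server the call would look something like stream_lines('http://www.jaywil.com/libsearch/SAMPLE.TXT', sub { print $_[0] }). The server still sends the whole file; the only difference is that each line is processed as soon as it has arrived.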

      You incorrectly defined a file as "a series of lines separated by newlines", whereas it's really "a series of lines ending in newlines (except possibly the last one)".

      split(/\n/, $data) doesn't work. It removes blank trailing lines.
      split(/\n/, $data, -1) doesn't work either. It adds a blank line.
      split(/^/m, $data, -1) works. Bonus: It doesn't remove the newline!

      The last one can also be written as split(/^/m, $data) and special handling allows split(/^/, $data) to work too.
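
To compare the three variants side by side, here is a small self-contained check. The sample string is made up for illustration: two text lines followed by two blank lines, so four lines in all.

```perl
use strict;
use warnings;

# Made-up sample: two text lines followed by two blank lines (4 lines total).
my $data = "a\nb\n\n\n";

my @default = split(/\n/, $data);        # trailing empty fields are dropped
my @keep    = split(/\n/, $data, -1);    # trailing empty fields kept, plus one extra
my @lines   = split(/^/m, $data);        # one element per line, newline retained

printf "default: %d, limit -1: %d, /^/m: %d\n",
    scalar @default, scalar @keep, scalar @lines;
```

For this sample the counts should come out as 2, 5 and 4. Only the /^/m form matches the real number of lines, and joining its elements gives back the original string unchanged.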

        split(/\n/, $data) doesn't work. It removes blank trailing lines.

        What is your definition of "doesn't work" here? The dropping of empty trailing fields in the case of the LIMIT parameter being 0 or not specified works as advertised in the split docs.

        split(/\n/, $data, -1) doesn't work either. It adds a blank line.

        Perl doesn't do this on my system:

            >perl -MData::Dumper -e "$_=qq(1\n2\n3); print Dumper split /\n/,$_,-1"
            $VAR1 = '1';
            $VAR2 = '2';
            $VAR3 = '3';

            >perl -MData::Dumper -e "$_=qq(1\n2\n3\n); print Dumper split /\n/,$_,-1"
            $VAR1 = '1';
            $VAR2 = '2';
            $VAR3 = '3';
            $VAR4 = '';

        When you read your data from a file, on the other hand, you have to be careful that your editor does not hide a trailing newline on the display. (E.g. Vim, where you can't tell the difference, other than by an initial [noeol] in the status line.)
