Someone on clpmisc recently asked about a layer over LWP that would allow line-by-line reading from a GET or POST request. I replied to him directly (not to the newsgroup) with code I'd developed. I didn't send it to the NG because I didn't want to be flamed for reinventing the wheel, and I'm sure my code is less than desirable.

I'm a bit irritated LWP isn't built to allow line-by-line reading of the response -- and it is not easily sub-classed, due to the tremendous amount of code. That's why I had to come up with what I post below. It also has not been too rigorously tested.

The use of the module is as follows:

use LWP::FileHandle;
lwpopen URL, $method, $url, $query;

where $method is either 'GET' or 'POST', $url is a FULL URL (like "http://www.server.com/path"), and $query is a string, array reference, or hash reference that holds the key-value pairs of the HTTP request. If you use an array or hash reference, the data MUST NOT be encoded yet -- if you use a string, the data MUST be encoded already. Sample usages are:

lwpopen URL, GET => $url, 'this=that';
lwpopen URL, GET => "$url?this=that";  # can append QS to URL
lwpopen URL, POST => $url, [ this => 'that' ];
lwpopen URL, POST => $url, { this => 'that' };
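To make the encoded/unencoded rule concrete, here is a rough sketch of how the three forms of $query could be flattened into one query string. It is not the module's code: build_query and escape are made-up names, and escape is only a core-Perl stand-in for URI::Escape's uri_escape (it assumes byte strings and escapes everything outside the RFC 3986 unreserved set). Hash keys are sorted here just to get a predictable order.

```perl
use strict;
use warnings;

# Stand-in for uri_escape: percent-encode everything that is not an
# unreserved character (assumes byte strings, not wide characters).
sub escape {
  my ($text) = @_;
  $text =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
  return $text;
}

# Hypothetical helper: accept a string, array ref, or hash ref.
sub build_query {
  my ($data) = @_;
  return '' unless defined $data;
  return $data unless ref $data;    # a string is assumed to be encoded already

  my @pairs;
  if    (ref $data eq 'HASH')  { @pairs = map { ($_, $data->{$_}) } sort keys %$data }
  elsif (ref $data eq 'ARRAY') { @pairs = @$data }
  else  { die "query must be a string, array ref, or hash ref\n" }

  my @parts;
  while (my ($k, $v) = splice @pairs, 0, 2) {
    push @parts, escape($k) . '=' . escape($v);
  }
  return join '&', @parts;
}

print build_query([ name => 'J. Doe', lang => 'C++' ]), "\n";  # name=J.%20Doe&lang=C%2B%2B
print build_query({ this => 'that' }), "\n";                   # this=that
print build_query('this=that'), "\n";                          # passed through untouched
```

An array ref keeps the pairs in the order you wrote them, which is the reason to prefer it over a hash ref when order matters to the server.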

Then you can read from URL as if it were a regular filehandle:

use LWP::FileHandle;
lwpopen JAPHY, GET => 'http://www.crusoe.net/~jeffp/';
while (<JAPHY>) {
  print;
}
lwpclose JAPHY;
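The reason <JAPHY> can be read like a regular filehandle is Perl's tie mechanism: reads on a tied handle are routed to the READLINE method of the tying class. Below is a minimal sketch of just that mechanism, with an in-memory list standing in for the socket. LineSource and DEMO are made-up names for illustration and are not part of the module.

```perl
use strict;
use warnings;

package LineSource;    # hypothetical class, stands in for the socket-backed one

sub TIEHANDLE {
  my ($class, @lines) = @_;
  return bless { LINES => [@lines] }, $class;
}

# Called whenever <FH> is evaluated on the tied handle.
sub READLINE {
  my ($self) = @_;
  return splice @{ $self->{LINES} } if wantarray;    # list context: all remaining lines
  return shift @{ $self->{LINES} };                  # scalar context: next line or undef
}

package main;

tie *DEMO, 'LineSource', "first\n", "second\n";

my @seen;
while (my $line = <DEMO>) {    # each read calls LineSource::READLINE
  push @seen, $line;
}
untie *DEMO;

print @seen;
```

Returning undef once the list is empty is what ends the while loop, the same way end-of-file does on a real handle.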

You can turn off the returning of the HTTP response headers by setting $LWP::FileHandle::HEADERS to 0 before the first read from the handle (the flag is only checked when the request is sent). I think that about covers it for the module... oh, it doesn't handle redirects. That could be added (a bit more code, but it can be done).
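Skipping the headers boils down to reading and discarding lines until the blank line that ends the header block. Here is a small sketch of that idea on a canned response held in a string, so it runs without a network. skip_headers and the sample response are made up for illustration, and the sketch also accepts a bare "\n" as the blank line, which is more lenient than a strict CRLF comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: consume lines up to and including the blank
# line that separates the headers from the body.
sub skip_headers {
  my ($fh) = @_;
  while (defined(my $line = <$fh>)) {
    last if $line =~ /^\r?\n$/;
  }
  return;
}

# Canned HTTP/1.0 response, read through an in-memory filehandle.
my $response = "HTTP/1.0 200 OK\r\n"
             . "Content-type: text/plain\r\n"
             . "\r\n"
             . "hello\r\n"
             . "world\r\n";

open my $fh, '<', \$response or die "cannot open in-memory handle: $!";
skip_headers($fh);
my @body = <$fh>;    # only the body lines are left
close $fh;

print @body;
```

Redirect handling would hook in at the same spot: instead of discarding the status line and headers, inspect them before handing the body to the caller.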

Is this a bad thing for me to have done/written? I don't mean to incite a flame war or a cargo cult in my honor but I felt this functionality warranted creation.

package LWP::FileHandle;

use IO::Socket;
use URI::Escape;
use Socket ();
use Carp;
use strict;
use vars qw( @ISA @EXPORT $HEADERS );

require Exporter;

@ISA = qw( Exporter );
@EXPORT = qw( lwpopen lwpclose );

$HEADERS = 1;

my $CRLF = $Socket::CRLF;

sub lwpopen (*@) {
  my ($fh,$mode,$url,$qs1) = @_;
  # Split into host, path, and query string (captured without its '?')
  my ($host,$path,$qs2) = $url =~ m!^(?:http://)?([^/?]+)([^?]*)\??(.*)!;
  my ($query,$socket,$obj);

  $mode = uc $mode;
  croak "HTTP mode must be 'GET' or 'POST'"
    if $mode ne 'GET' and $mode ne 'POST';

  if (UNIVERSAL::isa($qs1, 'ARRAY')) {
    for (my $i = 0; $i < @$qs1; $i += 2) {
      $query .= '&' if length $query;
      $query .= join '=',
        uri_escape($qs1->[$i]), uri_escape($qs1->[$i+1]);
    }
  }
  elsif (UNIVERSAL::isa($qs1, 'HASH')) {
    while (my ($k,$v) = each %$qs1) {
      $query .= '&' if length $query;
      $query .= join '=', uri_escape($k), uri_escape($v);
    }
  }
  elsif ($qs1 and not ref $qs1) { $query = $qs1 }
  elsif ($qs1) {
    croak "HTTP request must be array ref, hash ref, or string"
  }

  $query .= '&' if length $query and length $qs2;
  $query .= $qs2;
  $query = "?$query" if length $query;

  $path ||= '/';

  $host .= ':80' if $host !~ /:\d+$/;
  $socket = IO::Socket::INET->new($host);

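  {
    # On Perl 5.6 and later a bareword passed through a (*) prototype
    # arrives as a plain string, so resolve it in the caller's package.
    require Symbol;
    my $glob = Symbol::qualify_to_ref($fh, scalar caller);
    tie *$glob, 'LWP::FileHandle::Tie',
      $socket, $host, $path, $query, $mode;
  }

  return $socket ? 1 : 0;
}


sub lwpclose (*) {
  require Symbol;
  untie *{ Symbol::qualify_to_ref($_[0], scalar caller) };
}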



package LWP::FileHandle::Tie;

sub TIEHANDLE {
  my ($class,$socket,$host,$path,$query,$mode) = @_;
  bless {
    SOCKET => $socket,
    READFROM => 0,
    PATH => $path,
    QUERY => $query,
    MODE => $mode,
  }, $class;
}


sub READLINE {
  my $socket = $_[0]{SOCKET};
  if (!$_[0]{READFROM}++) {
    my ($path,$query) = @{$_[0]}{qw( PATH QUERY )};
    if ($_[0]{MODE} eq 'GET') {
      $socket->print("GET $path$query HTTP/1.0$CRLF$CRLF");
    }
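    else {
      my $enctype = "application/x-www-form-urlencoded";
      # QUERY carries a leading '?' for the GET line; strip it for the body
      my $body = defined $query ? $query : '';
      $body =~ s/^\?//;
      my $len = length $body;
      $socket->print("POST $path HTTP/1.0$CRLF");
      $socket->print("Content-type: $enctype$CRLF");
      $socket->print("Content-length: $len$CRLF$CRLF");
      $socket->print($body);
    }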

    if (!$LWP::FileHandle::HEADERS) {
      while (<$socket>) { last if $_ eq $CRLF }
    }
  }

  <$socket>;
}


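sub DESTROY {
  # The socket is undef when the connection in lwpopen failed
  $_[0]{SOCKET}->close if $_[0]{SOCKET};
}

1;  # a module file must return a true value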

$_="goto+F.print+chop;\n=yhpaj";F1:eval

In reply to to post, or not to post... by japhy
