Someone on clpmisc recently asked about a layer over LWP that
would allow for line-by-line reading from a GET or POST
request. I replied to him (not to the newsgroup) with code
I'd developed. I didn't send it to the NG because I didn't
want to be flamed for reinventing the wheel, and I'm sure
my code is less than desirable.
I'm a bit irritated that LWP isn't built to allow line-by-line
reading of the response -- and it is not easily sub-classed,
due to the tremendous amount of code. That's why I had to
come up with what I post below. It has not been rigorously
tested, either.
The use of the module is as follows:
    use LWP::FileHandle;
    lwpopen URL, $method, $url, $query;
where $method is either 'GET' or 'POST', $url is a FULL URL (like
"http://www.server.com/path"), and $query is a string, array
reference, or hash reference that holds the key-value pairs of the
HTTP request. If you use an array or hash reference, the data
MUST NOT be encoded yet -- if you use a string, the data
MUST be encoded already. Sample usages are:
    lwpopen URL, GET => $url, 'this=that';
    lwpopen URL, GET => "$url?this=that"; # can append QS to URL
    lwpopen URL, POST => $url, [ this => 'that' ];
    lwpopen URL, POST => $url, { this => 'that' };
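To make the encoded/unencoded rule concrete, here is a minimal sketch of how an unencoded array reference could be turned into the kind of string you would otherwise pass pre-encoded. The names encode_pairs and escape_component are made up for this illustration and are not part of LWP::FileHandle. The escape helper is a core-Perl stand-in for uri_escape from URI::Escape (which the module itself uses), and it assumes plain byte strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI::Escape::uri_escape: percent-encode every
# byte that is not an RFC 3986 "unreserved" character.
sub escape_component {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Hypothetical helper: join key/value pairs from an array ref into
# "k1=v1&k2=v2", escaping each key and each value on the way.
sub encode_pairs {
    my ($pairs) = @_;
    my @parts;
    for (my $i = 0; $i < @$pairs; $i += 2) {
        push @parts, join '=',
            escape_component($pairs->[$i]),
            escape_component($pairs->[$i + 1]);
    }
    return join '&', @parts;
}

print encode_pairs([ this => 'that', name => 'J. Random Hacker' ]), "\n";
```

A hash reference would go through the same steps, except that the pairs come out in whatever order each() returns them.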
Then you can read from URL as if it were a regular filehandle:
    use LWP::FileHandle;
    lwpopen JAPHY, GET => 'http://www.crusoe.net/~jeffp/';
    while (<JAPHY>) {
      print;
    }
    lwpclose JAPHY;
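The reason <JAPHY> can be read like a file is that the handle is tied to a class with a READLINE method. Here is a rough, network-free sketch of that mechanism. Demo::LineSource is a hypothetical class invented for this example; it hands back lines from an in-memory list, where the real module reads them from a socket.

```perl
use strict;
use warnings;

package Demo::LineSource;    # hypothetical class, for illustration only

sub TIEHANDLE {
    my ($class, @lines) = @_;
    return bless { LINES => [@lines] }, $class;
}

# Called for every <HANDLE>; returns all remaining lines in list
# context and the next line (or undef at the end) in scalar context.
sub READLINE {
    my ($self) = @_;
    return splice @{ $self->{LINES} } if wantarray;
    return shift @{ $self->{LINES} };
}

package main;

tie *DEMO, 'Demo::LineSource', "first\n", "second\n";
while (my $line = <DEMO>) {
    print "got: $line";
}
untie *DEMO;
```

Once the tie is in place, the reading code does not need to know where the lines come from.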
You can turn off the returning of the HTTP response headers
by setting $LWP::FileHandle::HEADERS to 0. I think that about
covers it for the module... oh, it doesn't handle redirects.
That could be added (a bit more code, but it can be done).
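As a rough idea of what that extra code might involve, the sketch below looks at a response's header lines and pulls out the redirect target, if there is one. find_redirect is a hypothetical helper, not something the module provides, and a real version would still have to open a new socket to the target and resend the request.

```perl
use strict;
use warnings;

# Hypothetical helper: given the raw header lines of a response, return
# the Location target if the status is a 3xx redirect, otherwise undef.
sub find_redirect {
    my (@header_lines) = @_;
    my $status_line = shift @header_lines;
    return unless defined $status_line
        and $status_line =~ m{^HTTP/\d+\.\d+\s+3\d\d\b};
    for my $line (@header_lines) {
        last if $line =~ /^\015?\012?$/;    # blank line ends the headers
        return $1 if $line =~ /^Location:\s*(\S+)/i;
    }
    return;
}

my @headers = (
    "HTTP/1.0 302 Found\015\012",
    "Location: http://www.example.com/new\015\012",
    "\015\012",
);
print find_redirect(@headers), "\n";
```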
Is this a bad thing for me to have done/written? I
don't mean to incite a flame war or a cargo cult in my honor,
but I felt this functionality warranted creation.
package LWP::FileHandle;
use IO::Socket;
use URI::Escape;
use Socket ();
use Carp;
use strict;
use vars qw( @ISA @EXPORT $HEADERS );
require Exporter;
@ISA = qw( Exporter );
@EXPORT = qw( lwpopen lwpclose );
$HEADERS = 1;
my $CRLF = $Socket::CRLF;
sub lwpopen (*@) {
  my ($fh,$mode,$url,$qs1) = @_;
  my ($host,$path,$qs2) = $url =~ m!(?:http://)?([^/]+)([^?]*)(.*)!;
  my ($query,$socket,$obj);
  $mode = uc $mode;
  croak "HTTP mode must be 'GET' or 'POST'"
    if $mode ne 'GET' and $mode ne 'POST';
  if (UNIVERSAL::isa($qs1, 'ARRAY')) {
    for (my $i = 0; $i < @$qs1; $i += 2) {
      $query .= '&' if length $query;
      $query .= join '=',
        uri_escape($qs1->[$i]), uri_escape($qs1->[$i+1]);
    }
  }
  elsif (UNIVERSAL::isa($qs1, 'HASH')) {
    while (my ($k,$v) = each %$qs1) {
      $query .= '&' if length $query;
      $query .= join '=', uri_escape($k), uri_escape($v);
    }
  }
  elsif ($qs1 and not ref $qs1) { $query = $qs1 }
  elsif ($qs1) {
    croak "HTTP request must be array ref, hash ref, or string"
  }
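  # $qs2 still carries its leading '?' from the URL; drop it before merging
  $qs2 =~ s/^\?// if defined $qs2;
  $query = '' unless defined $query;
  $query .= '&' if length $query and length $qs2;
  $query .= $qs2 if defined $qs2;
  # only a GET request line takes the '?'; a POST body must not start with one
  $query = "?$query" if $mode eq 'GET' and length $query;
  $path ||= '/';
  $host .= ':80' if $host !~ /:\d+$/;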
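  # bail out before tying, so a failed connection never reaches DESTROY
  $socket = IO::Socket::INET->new($host) or return 0;
  # a bareword handle arrives as a plain string, so qualify it into
  # the caller's package rather than into LWP::FileHandle
  require Symbol;
  $fh = Symbol::qualify_to_ref($fh, scalar caller);
  tie *$fh, 'LWP::FileHandle::Tie',
    $socket, $host, $path, $query, $mode;
  return 1;
}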
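sub lwpclose (*) {
  # resolve a bareword handle in the caller's package, as lwpopen must
  require Symbol;
  my $fh = Symbol::qualify_to_ref($_[0], scalar caller);
  untie *$fh;
}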
package LWP::FileHandle::Tie;
sub TIEHANDLE {
  my ($class,$socket,$host,$path,$query,$mode) = @_;
  bless {
    SOCKET => $socket,
    READFROM => 0,
    PATH => $path,
    QUERY => $query,
    MODE => $mode,
  }, $class;
}
sub READLINE {
  my $socket = $_[0]{SOCKET};
  # the request is sent lazily, on the first read from the handle
  if (!$_[0]{READFROM}++) {
    my ($path,$query) = @{$_[0]}{qw( PATH QUERY )};
    if ($_[0]{MODE} eq 'GET') {
      $socket->print("GET $path$query HTTP/1.0$CRLF$CRLF");
    }
    else {
      my $enctype = "application/x-www-form-urlencoded";
      my $len = length $query;
      $socket->print("POST $path HTTP/1.0$CRLF");
      $socket->print("Content-type: $enctype$CRLF");
      $socket->print("Content-length: $len$CRLF$CRLF");
      $socket->print($query);
    }
    if (!$LWP::FileHandle::HEADERS) {
      # skip the headers without clobbering the caller's $_, and accept
      # a bare LF as well as CRLF for the blank line that ends them
      while (defined(my $line = <$socket>)) {
        last if $line =~ /^\015?\012$/;
      }
    }
  }
  <$socket>;
}
sub DESTROY {
  $_[0]{SOCKET}->close if $_[0]{SOCKET};
}
1;
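For reference, here is a small sketch of the raw request text that gets written on the first read, pulled out into a function so it can be inspected without a socket. build_request is a hypothetical name used only for this illustration. It assumes the query string is passed already encoded and without a leading '?', and it spells out CRLF literally rather than using $Socket::CRLF.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Hypothetical helper: return the raw HTTP/1.0 request text for a path
# and an already-encoded query string (no leading '?').
sub build_request {
    my ($mode, $path, $query) = @_;
    if ($mode eq 'GET') {
        my $target = length($query) ? "$path?$query" : $path;
        return "GET $target HTTP/1.0$CRLF$CRLF";
    }
    # POST: the query travels in the body, announced by Content-length
    return join $CRLF,
        "POST $path HTTP/1.0",
        "Content-type: application/x-www-form-urlencoded",
        "Content-length: " . length($query),
        '',
        $query;
}

print build_request(POST => '/cgi-bin/form', 'this=that');
```

The empty string in the join produces the blank line that separates the POST headers from the body.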
$_="goto+F.print+chop;\n=yhpaj";F1:eval