You could hang a callback off the request and die once you've seen 4096 characters. Your script won't actually die: LWP traps the die inside the callback, so you simply return from the request method with whatever you have collected so far. Something like this:
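use HTTP::Request;
use LWP::UserAgent;

# $url and $file are assumed to be set already
my $html = '';
my $request = HTTP::Request->new(GET => "$url/$file");
my $ua = LWP::UserAgent->new;
# LWP calls cb() once per chunk as the data arrives
my $response = $ua->request($request, \&cb);

sub cb {
    $html .= $_[0];                  # first argument is the chunk of data
    die if length($html) >= 4096;    # aborts the transfer; LWP traps the die
}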
Note that chunks arrive in whatever size the socket delivers, so $html will usually overshoot the limit; you might want to trim it back to exactly 4096 chars with substr($html, 0, 4096). Also, you may have asked the question because you know that what you are looking for is somewhere within the first 4k. If so, you can test for it in the callback and die as soon as you find it, which ends the transfer that much earlier.
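For what it's worth, here is a rough sketch of the trimming and the early exit described above. make_collector is a made-up helper name, not something LWP provides, and the chunks are simulated so that the snippet runs without a network connection; the title pattern is just an assumed example of "what you are looking for".

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): builds a content callback that
# appends each chunk to $$buf_ref and dies once $limit characters have
# arrived or, if a pattern is given, as soon as the pattern matches.
sub make_collector {
    my ($buf_ref, $limit, $pattern) = @_;
    return sub {
        my ($chunk) = @_;    # the chunk of data is the first argument
        $$buf_ref .= $chunk;
        die "found\n" if defined $pattern && $$buf_ref =~ $pattern;
        if (length($$buf_ref) >= $limit) {
            $$buf_ref = substr($$buf_ref, 0, $limit);    # trim the overshoot
            die "limit\n";
        }
        return;
    };
}

# Simulated chunks standing in for what would come off the socket.
my @chunks = ('<html><head>', '<title>Example</title>', '</head><body>', 'x' x 5000);

my $html = '';
my $cb   = make_collector(\$html, 4096, qr{</title>}i);

# This loop imitates a caller that traps the die and stops reading.
my ($seen, $reason) = (0, '');
for my $chunk (@chunks) {
    $seen++;
    unless (eval { $cb->($chunk); 1 }) {
        $reason = $@;
        last;
    }
}
print "stopped after $seen chunk(s): $reason";
```

With a real request you would hand the closure to LWP in place of \&cb, i.e. $ua->request($request, make_collector(\$html, 4096, qr{</title>}i)). When the callback dies, LWP puts the die message in the X-Died header of the response, so you can tell the two exits apart afterwards.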
Note that this is about as efficient as it gets; you are getting chunks of the page more or less as they are peeled off the socket and then dealing with them on the fly.
Hmm... I never noticed the $self->{ua} thing before. I'll have to take a closer look at that... or hang on, is that just part of a larger object?
--g r i n d e r