Re: Determining Content-Length when there is no Content-Length header

First of all, you probably don't actually need to find out the exact content length. You just need to know whether certain URLs contain data over a certain size threshold. You'll need to decide what an acceptable threshold is, and then, instead of using the higher-level HTTP functions, use sockets to read the URL data up to that maximum size. While you're reading this into your buffer, you should be able to parse any Content-Length header that comes along. If a Content-Length header is present, you can decide right away whether to stop reading or keep going and read the full file. If there's no Content-Length header, keep reading until you either reach the end of the data or hit your threshold. Hope this makes sense. BTW, I think it's possible for a server to lie about Content-Length and get away with it.
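A rough sketch of that approach, in case it helps. The helper name `probe_response`, the 64 KB header cap, and the 4096-byte read size are all made up for illustration. It assumes an HTTP/1.0 request, so the server closes the connection at the end and should not use chunked encoding; redirects and HTTPS are not handled.

```perl
use strict;
use warnings;
use IO::Socket::INET;   # only needed for the usage sketch at the bottom

# Hypothetical helper: read an HTTP response from $fh, giving up once the
# body is known (or seen) to be larger than $max_body bytes.
# Returns a hash ref, or nothing if no complete header block was found.
sub probe_response {
    my ($fh, $max_body) = @_;
    my ($buf, $header_end) = ('', -1);

    # Read until the blank line that ends the headers (arbitrary 64 KB cap).
    while ($header_end < 0 && length($buf) < 65536) {
        my $n = read($fh, $buf, 4096, length($buf));
        last unless $n;
        $header_end = index($buf, "\r\n\r\n");
    }
    return if $header_end < 0;

    my $headers = substr($buf, 0, $header_end);
    my ($cl) = $headers =~ /^Content-Length:[ \t]*(\d+)\s*$/mi;
    my $body_bytes = length($buf) - ($header_end + 4);

    # The header alone may already tell us the file is too big.
    if (defined $cl && $cl > $max_body) {
        return { content_length => $cl, body_bytes => $body_bytes, over_limit => 1 };
    }

    # Otherwise keep reading until EOF, the announced length, or the limit.
    while ($body_bytes <= $max_body && !(defined $cl && $body_bytes >= $cl)) {
        my $n = read($fh, my $chunk, 4096);
        last unless $n;
        $body_bytes += $n;
    }

    return {
        content_length => $cl,                          # undef if no header
        body_bytes     => $body_bytes,                  # body bytes actually read
        over_limit     => $body_bytes > $max_body ? 1 : 0,
    };
}

# Usage sketch (host, path and the 1 MB limit are placeholders):
# my $sock = IO::Socket::INET->new(PeerAddr => 'example.com', PeerPort => 80,
#                                  Proto => 'tcp') or die "connect: $!";
# print $sock "GET /some/file HTTP/1.0\r\nHost: example.com\r\n\r\n";
# my $info = probe_response($sock, 1_000_000);
# print "too big\n" if $info && $info->{over_limit};
```

Because the loop counts the bytes that actually arrive, a server that understates its Content-Length doesn't matter much for the "is it over the limit" question; the header is only used as an early exit.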

the hardest line to type correctly is: stty erase ^H

Re^2: Determining Content-Length when there is no Content-Length header by jae_63 (Beadle) on Apr 14, 2011 at 15:51 UTC
OK, this is a very old thread, but I found it while searching for information on a related problem, and now that I've solved that problem I think the solution should be posted here, since Googling "Perl CURLOPT_RANGE" doesn't currently return any useful hits. The bottom line is that if you want to fetch a piece of a remote file using Perl, you can take the WWW::Curl package http://search.cpan.org/~szbalint/WWW-Curl-4.15/lib/WWW/Curl.pm and modify the first example to include the lines `my $firstbyte = 50; my $lastbyte = 100; $curl->setopt(CURLOPT_RANGE,"$firstbyte-$lastbyte");` So the OP could use this technique to see whether, e.g., he's able to successfully fetch the 1,000,000th byte of a remote file. If he can fetch it, the file is at least that big, and he might decide not to try to download it. I hope that this info is useful to someone.
Re^3: Determining Content-Length when there is no Content-Length header by afoken (Chancellor) on Apr 14, 2011 at 16:47 UTC
Nice idea, but not all web servers / web applications support byte ranges. I think the proper behaviour for a web server is to ignore the unknown / unsupported header and send the entire resource, which is clearly not what the OP wanted. See also Re: Determining Content-Length when there is no Content-Length header

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
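For what it's worth, a client can usually tell which of the two cases it got by looking at the reply headers: a server that honors the range answers with status 206 and a Content-Range header, while one that ignores it answers 200 and starts sending everything. A minimal sketch follows; `check_range_reply` is a hypothetical name, and it assumes you already have the raw header text (e.g. from a socket read or a header callback).

```perl
use strict;
use warnings;

# Hypothetical helper: inspect the raw header text of the reply to a
# "Range: bytes=FIRST-LAST" request and report whether the range was honored.
sub check_range_reply {
    my ($raw_headers) = @_;

    # Status line, e.g. "HTTP/1.1 206 Partial Content"
    my ($status) = $raw_headers =~ m{\AHTTP/\d(?:\.\d)?\s+(\d{3})};
    return unless defined $status;

    my %info = (
        status  => $status,
        honored => ($status == 206 ? 1 : 0),
    );

    # A 206 reply carries e.g. "Content-Range: bytes 50-100/123456";
    # the part after the slash is the full size, or "*" if unknown.
    if ($raw_headers =~ m{^Content-Range:[ \t]*bytes[ \t]+(\d+)-(\d+)/(\d+|\*)}mi) {
        @info{qw(first last)} = ($1, $2);
        $info{total} = $3 eq '*' ? undef : $3;
    }

    return \%info;
}

my $reply = check_range_reply(
    "HTTP/1.1 206 Partial Content\r\n"
  . "Content-Range: bytes 50-100/123456\r\n\r\n"
);
print "range honored, total size: $reply->{total}\n" if $reply->{honored};
```

If `honored` comes back false with a 200 status, the safe move is to close the connection instead of reading the body. A 416 status means the requested start byte lies beyond the end of the resource, which also answers the "is it at least N bytes" question.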

