http://qs321.pair.com?node_id=1114014

sam_bakki has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

While downloading data from an HTTPS URL, I see different results with Net::SSL and IO::Socket::SSL. Basically, IO::Socket::SSL does not download the full data.

To show what's really happening, I have two scripts below:

1. One uses Net::SSL and downloads the data properly from the server.
2. The other uses IO::Socket::SSL, downloads only the first chunk (I think) from the server, and quits.

To show the differences between the downloads, I have included the MD5 sum and file size of each.
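For reference, a minimal sketch of how the size and MD5 sum of a saved file can be checked on their own, without File::Slurp. The helper name file_size_and_md5 is made up for illustration; it only uses the core Digest::MD5 module and reads the file in raw mode so that Windows newline translation does not change the digest.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper (not part of the scripts in this post):
# returns (size in bytes, MD5 hex digest) for the given file.
sub file_size_and_md5 {
    my ($path) = @_;

    # ':raw' avoids CRLF translation on Windows, which would alter the digest.
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;

    my $size = -s $path;
    return ($size, $md5);
}

# Report every file named on the command line (prints nothing without arguments).
for my $file (@ARGV) {
    my ($size, $md5) = file_size_and_md5($file);
    printf "%s: %.2f KB, MD5 %s\n", $file, $size / 1024, $md5;
}
```

Running it against the file saved by each script (for example qtff-2001.pdf) gives the two figures to compare.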

My environment
OS: Windows 7, x86_64 bit
Perl: Active Perl, perl 5, version 20, subversion 1 (v5.20.1) built for MSWin32-x86-multi-thread-64int

Note: I saw the same behavior in Active Perl 5.10, 5.14, 5.16 and 5.18

Script 1 - Using Net::SSL and Crypt::SSLeay - Working

#WORKING HTTPS DOWNLOAD Using Net::SSL in Windows + Active Perl
use strict;
use warnings;
use Crypt::SSLeay;
use Net::SSL;
use WWW::Mechanize;
use HTTP::Cookies;
use HTTP::Message;
use Digest::MD5;
use File::Slurp;
use Data::Dumper;
use Devel::ModuleDumper;

#Globals
$|=1;
#Force LWP to use Net::SSL instead of IO::Socket::SSL
$ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS} = "Net::SSL";
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
delete $ENV{https_proxy} if exists $ENV{https_proxy};
delete $ENV{http_proxy} if exists $ENV{http_proxy};

#Variables
my $browser = "";
my $url = 'https://developer.apple.com/standards/qtff-2001.pdf';
my $pageContent = '';
my $fileName = '';
my $md5Obj = Digest::MD5->new();

print "\n USING Net::SSL";

#Init Mechanize
$browser = WWW::Mechanize->new(autocheck => 1, noproxy => 1, ssl_opts => { 'verify_hostname' => 0 });

# Add cookie jar
$browser->cookie_jar(HTTP::Cookies->new());
$browser->agent_alias('Linux Mozilla');
$browser->add_header('Accept-Encoding' => scalar HTTP::Message::decodable());
$browser->timeout(120);

#Get URL
$browser->get($url);
if ($browser->success())
{
    print "\n INFO: Got URL: $url";
    $fileName = $browser->response()->filename();
    print "\n INFO: Save in File: $fileName";
    $browser->save_content($fileName);

    #Calculate MD5 sum
    $pageContent = read_file( $fileName, binmode => ':raw' );
    print "\n INFO: $fileName Size: ", length($pageContent)/1024, " KB";
    $md5Obj->add($pageContent);
    print "\n INFO: $fileName MD5 Sum: ", $md5Obj->hexdigest();
    undef $md5Obj;
}
else
{
    print "\n ERROR: Can't get URL $url ", $browser->status();
}

print "\n\n INFO: ********************* DUMP ********************";
print "\n", Dumper(\$browser);
print "\n INFO: ********************* DUMP ********************";

exit 0;

Output1:


  USING Net::SSL
 INFO: Got URL: https://developer.apple.com/standards/qtff-2001.pdf
 INFO: Save in File: qtff-2001.pdf
 INFO: qtff-2001.pdf Size: 5465.48046875 KB
 INFO: qtff-2001.pdf MD5 Sum: d1aee95cc06d529e67b707257a5cf3eb

Loaded Modules
-------------------
Carp	1.3301
Compress::Raw::Bzip2	2.068
Compress::Raw::Zlib	2.068
Compress::Zlib	2.068
Crypt::SSLeay	0.72
Crypt::SSLeay::CTX	none
Crypt::SSLeay::MainContext	none
Crypt::SSLeay::X509	none
Data::Dumper	2.154
Digest::base	1.16
Digest::MD5	2.53
Encode	2.67
Encode::Alias	2.18
Encode::Config	2.05
Encode::Encoding	2.07
Errno	1.2003
Exporter	5.70
Exporter::Heavy	5.70
Fcntl	1.11
File::Glob	1.23
File::GlobMapper	1.000
File::Slurp	9999.19
HTML::Entities	3.69
HTML::Form	6.03
HTML::Parser	3.71
HTML::PullParser	3.57
HTML::Tagset	3.20
HTML::TokeParser	3.69
HTTP::Config	6.00
HTTP::Cookies	6.01
HTTP::Cookies::Netscape	6.00
HTTP::Date	6.02
HTTP::Headers	6.05
HTTP::Headers::Util	6.03
HTTP::Message	6.06
HTTP::Request	6.00
HTTP::Request::Common	6.04
HTTP::Response	6.04
HTTP::Status	6.03
IO	1.31
IO::Compress::Adapter::Deflate	2.068
IO::Compress::Base	2.068
IO::Compress::Base::Common	2.068
IO::Compress::Gzip	2.068
IO::Compress::Gzip::Constants	2.068
IO::Compress::RawDeflate	2.068
IO::Compress::Zlib::Constants	2.068
IO::Compress::Zlib::Extra	2.068
IO::File	1.16
IO::Handle	1.35
IO::Seekable	1.1
IO::Socket	1.37
IO::Socket::INET	1.35
IO::Socket::IP	0.35
IO::Socket::UNIX	1.26
IO::Uncompress::Adapter::Bunzip2	2.068
IO::Uncompress::Adapter::Inflate	2.068
IO::Uncompress::Base	2.068
IO::Uncompress::Bunzip2	2.068
IO::Uncompress::Gunzip	2.068
IO::Uncompress::Inflate	2.068
IO::Uncompress::RawInflate	2.068
List::Util	1.41
LWP	6.08
LWP::MemberMixin	none
LWP::Protocol	6.06
LWP::Protocol::http	none
LWP::Protocol::https	6.06
LWP::UserAgent	6.06
MIME::Base64	3.14
Net::HTTP	6.07
Net::HTTP::Methods	6.07
Net::HTTPS	6.04
Net::SSL	2.86
POSIX	1.38_03
Scalar::Util	1.41
SelectSaver	1.02
Socket	2.016
Storable	2.51
Symbol	1.07
Tie::Hash	1.05
Time::Local	1.2300
URI	1.65
URI::Escape	3.31
URI::http	none
URI::https	none
URI::_generic	none
URI::_query	none
URI::_server	none
WWW::Mechanize	1.73

Script 2 - Using IO::Socket::SSL - Not Working. Only part of the PDF file is downloaded

#NOT WORKING HTTPS DOWNLOAD Using IO::Socket::SSL in Windows + Active Perl
use strict;
use warnings;
use IO::Socket::SSL;
use WWW::Mechanize;
use HTTP::Cookies;
use HTTP::Message;
use Digest::MD5;
use File::Slurp;
use Data::Dumper;
use Devel::ModuleDumper;

#Globals
$|=1;
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;

#Variables
my $browser = "";
my $url = 'https://developer.apple.com/standards/qtff-2001.pdf';
my $pageContent = '';
my $fileName = '';
my $md5Obj = Digest::MD5->new();

print "\n USING IO::Socket::SSL";

#Init Mechanize
$browser = WWW::Mechanize->new(autocheck => 1, noproxy => 1, ssl_opts => { 'verify_hostname' => 0 });

# Add cookie jar
$browser->cookie_jar(HTTP::Cookies->new());
$browser->agent_alias('Linux Mozilla');
$browser->add_header('Accept-Encoding' => scalar HTTP::Message::decodable());
$browser->timeout(120);

#Get URL
$browser->get($url);
if ($browser->success())
{
    print "\n INFO: Got URL: $url";
    $fileName = $browser->response()->filename();
    print "\n INFO: Save in File: $fileName";
    $browser->save_content($fileName);

    #Calculate MD5 sum
    $pageContent = read_file( $fileName, binmode => ':raw' );
    print "\n INFO: $fileName Size: ", length($pageContent)/1024, " KB";
    $md5Obj->add($pageContent);
    print "\n INFO: $fileName MD5 Sum: ", $md5Obj->hexdigest();
    undef $md5Obj;
}
else
{
    print "\n ERROR: Can't get URL $url ", $browser->status();
}

print "\n\n INFO: ********************* DUMP ********************";
print "\n", Dumper(\$browser);
print "\n INFO: ********************* DUMP ********************";

exit 0;

Output2:


  USING IO::Socket::SSL
 INFO: Got URL: https://developer.apple.com/standards/qtff-2001.pdf
 INFO: Save in File: qtff-2001.pdf
 INFO: qtff-2001.pdf Size: 6.66796875 KB
 INFO: qtff-2001.pdf MD5 Sum: 4049c364f7332790c3abe548d6a4297c

Loaded Modules
----------------
ActivePerl::Config      none
Carp    1.3301
Compress::Raw::Bzip2    2.068
Compress::Raw::Zlib     2.068
Compress::Zlib  2.068
Cwd     3.48
Data::Dumper    2.154
Digest::base    1.16
Digest::MD5     2.53
Encode  2.67
Encode::Alias   2.18
Encode::Byte    2.04
Encode::Config  2.05
Encode::Encoding        2.07
Encode::Locale  1.03
Errno   1.2003
Exporter        5.70
Exporter::Heavy 5.70
Fcntl   1.11
File::Basename  2.85
File::Glob      1.23
File::GlobMapper        1.000
File::Slurp     9999.19
File::Spec      3.48
File::Spec::Unix        3.48
File::Spec::Win32       3.48
HTML::Entities  3.69
HTML::Form      6.03
HTML::Parser    3.71
HTML::PullParser        3.57
HTML::Tagset    3.20
HTML::TokeParser        3.69
HTTP::Config    6.00
HTTP::Cookies   6.01
HTTP::Cookies::Netscape 6.00
HTTP::Date      6.02
HTTP::Headers   6.05
HTTP::Headers::Util     6.03
HTTP::Message   6.06
HTTP::Request   6.00
HTTP::Request::Common   6.04
HTTP::Response  6.04
HTTP::Status    6.03
IO      1.31
IO::Compress::Adapter::Deflate  2.068
IO::Compress::Base      2.068
IO::Compress::Base::Common      2.068
IO::Compress::Gzip      2.068
IO::Compress::Gzip::Constants   2.068
IO::Compress::RawDeflate        2.068
IO::Compress::Zlib::Constants   2.068
IO::Compress::Zlib::Extra       2.068
IO::File        1.16
IO::Handle      1.35
IO::Seekable    1.1
IO::Socket      1.37
IO::Socket::INET        1.35
IO::Socket::IP  0.35
IO::Socket::SSL 2.010
IO::Socket::SSL::PublicSuffix   none
IO::Socket::UNIX        1.26
IO::Uncompress::Adapter::Bunzip2        2.068
IO::Uncompress::Adapter::Inflate        2.068
IO::Uncompress::Base    2.068
IO::Uncompress::Bunzip2 2.068
IO::Uncompress::Gunzip  2.068
IO::Uncompress::Inflate 2.068
IO::Uncompress::RawInflate      2.068
List::Util      1.41
LWP     6.08
LWP::MemberMixin        none
LWP::Protocol   6.06
LWP::Protocol::http     none
LWP::Protocol::https    6.06
LWP::UserAgent  6.06
Mozilla::CA     20141217
Net::HTTP       6.07
Net::HTTP::Methods      6.07
Net::HTTPS      6.04
Net::SSLeay     1.66
POSIX   1.38_03
Scalar::Util    1.41
SelectSaver     1.02
Socket  2.016
Socket6 0.25
Storable        2.51
Symbol  1.07
Tie::Hash       1.05
Time::Local     1.2300
URI     1.65
URI::Escape     3.31
URI::http       none
URI::https      none
URI::_generic   none
URI::_idna      none
URI::_punycode  1.65
URI::_query     none
URI::_server    none
Win32::API      0.79
Win32::API::Struct      0.65
Win32::API::Type        0.69
Win32::Console  0.10
WWW::Mechanize  1.73

I did not paste the Dumper output because it is huge and, because of the binary contents, does not copy properly into the browser.

Q: Why is IO::Socket::SSL not downloading the full data? What more do I need to do in Script 2?

Update: Added Module versions

Update1: I have tested Script 2 on Linux (Fedora 21, x64, Perl 5.18) and it works fine :). So this looks like a problem only with Windows + ActiveState Perl :(
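In case it helps anyone reproduce this, here is a rough sketch of how a short download could be detected from the response object itself rather than from the saved file. It is only a sketch under assumptions: the helper name truncation_signs is made up, it assumes the server sends a Content-Length header (a chunked response will not have one), and it relies on my understanding that LWP adds X-Died / Client-Aborted headers when reading the body is cut short. The raw content is compared, not decoded_content, because Content-Length describes the bytes on the wire (possibly gzip-encoded).

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: returns a list of reasons why the body of an
# HTTP::Response looks truncated; an empty list means nothing was found.
sub truncation_signs {
    my ($res) = @_;
    my @signs;

    # Compare the advertised length with the raw (still encoded) body length.
    my $expected = $res->header('Content-Length');
    my $got      = length $res->content;
    if (defined $expected && $expected =~ /^\d+$/ && $got != $expected) {
        push @signs, "got $got of $expected bytes";
    }

    # Headers that LWP is assumed to add when a read was aborted.
    for my $name ('X-Died', 'Client-Aborted') {
        my $value = $res->header($name);
        push @signs, "$name: $value" if defined $value;
    }
    return @signs;
}

# Demo with an in-memory response (no network needed): 3 bytes of body
# against an advertised Content-Length of 10.
my $short = HTTP::Response->new(200, 'OK', [ 'Content-Length' => 10 ], 'abc');
print "$_\n" for truncation_signs($short);
```

In Script 2 the same check would be called as truncation_signs($browser->response()) right after $browser->get($url).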

Thanks & Regards,
Bakkiaraj M
My Perl Gtk2 technology demo project - http://code.google.com/p/saaral-soft-search-spider/ , contributions are welcome.