http://qs321.pair.com?node_id=11113758

yujong_lee has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I'm a student studying programming. I have experience programming in C and Perl, and I know most of the syntax and concepts of both languages. However, I have no experience building my own project, and I have only basic knowledge of networking.

So I started my first project using WWW::Mechanize. My goal is to get a list of the titles and URLs of bulletin board posts for a given period. To start with, I tried to fetch the HTML from this site: http://hiphople.com/kboard (it is a Korean site).

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 1 );
$mech->get( "http://hiphople.com/kboard" );
print $mech->content();

But the output is ���w�Ʊ0��tN��ӆR#�... WWW::Mechanize uses UTF-8 by default, and the target site's HTML header says it uses UTF-8 too, so I don't think it is an encoding problem. I then found that the target site uses gzip (I saw it in the HTTP response header).
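As a rough sketch of how the compression can be checked, the snippet below builds a stand-in HTTP::Response locally (so no network access is needed) and compares the raw body with the decoded one. The sample HTML string and the hand-built response are assumptions made for illustration only; they are not the real reply from hiphople.com. It assumes HTTP::Response is installed, which is normally the case wherever WWW::Mechanize is.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical page body, standing in for what a server might send.
my $html = "<html><body>hello</body></html>";

# Compress it in memory to imitate a gzip-encoded reply.
my $gzipped;
gzip \$html => \$gzipped or die "gzip failed: $GzipError";

# Hand-built response carrying the same kind of headers a gzip reply has.
my $res = HTTP::Response->new(
    200, 'OK',
    [
        'Content-Type'     => 'text/html; charset=utf-8',
        'Content-Encoding' => 'gzip',
    ],
    $gzipped,
);

# This header is what reveals that the body is compressed.
my $encoding = $res->header('Content-Encoding');
print "Content-Encoding: $encoding\n";

# content() returns the body exactly as received (still compressed here);
# decoded_content() undoes the Content-Encoding and the charset.
my $raw     = $res->content;
my $decoded = $res->decoded_content;

print "raw length:     ", length($raw),     "\n";
print "decoded length: ", length($decoded), "\n";
print "decoded body:   $decoded\n";
```

After a real request, `$mech->response` returns the same kind of HTTP::Response object, so the same two methods can be compared there.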

To solve this, I first tried WWW::Mechanize::Gzip. But its documentation says "If the webserver does not support gzip-compression, no decompression will be made.", and I guess the http://hiphople.com/kboard web server does not support gzip compression, because it doesn't work.

So I tried to decompress it myself, without help from the web server. The code below is my attempt.

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use IO::Uncompress::Gunzip qw(gunzip);

my $mech = WWW::Mechanize->new( autocheck => 1 );
my $response = $mech->get( "http://hiphople.com/kboard" );
my $output = "file1.txt";
gunzip $response => $output;

But the result is "IO::Uncompress::Gunzip::gunzip: illegal input parameter". I guess that is because $response is not in .gz format, due to Mechanize. That is all I can guess. I don't know what to do.
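For reference, gunzip accepts a filename, a filehandle, a scalar reference, an array reference, or a fileglob string as its input. The value returned by `$mech->get` is an HTTP::Response object, which is none of these. The sketch below shows the scalar-reference form on data compressed in memory; the sample string is an assumption that stands in for a real response body, and this is only one possible approach, not a confirmed fix for this site.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical page body, compressed here so the example runs offline.
my $original = "<html><body>hello</body></html>";
my $raw;
gzip \$original => \$raw or die "gzip failed: $GzipError";

# Pass REFERENCES to scalars: input buffer => output buffer.
# Passing an object (or the scalar itself) is not an accepted input type.
my $html;
gunzip \$raw => \$html or die "gunzip failed: $GunzipError";

print $html, "\n";
```

With a live request, the raw body from `$response->content` would take the place of `$raw`, assuming that body really is gzip data.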

So this is what I ran into while trying to get a simple HTML file from the site I want. Now I need some help. Getting the HTML file is only the first step of my project, and it was already hard to achieve. Can anyone help me?

P.S. My English is not that great, so I'm afraid this was difficult to read. Sorry for that.