PerlMonks  

Re: Problem while using WWW::Mechanize module for getting html

by Aldebaran (Curate)
on Apr 18, 2020 at 20:51 UTC ( [id://11115745]=note )


in reply to Problem while using WWW::Mechanize module for getting html

So I start my first project using WWW::Mechanize. My goal is to get list of titles and url of bulletin board for given period. To start with, I tried to get html from this site. http://hiphople.com/kboard (It a Korean)

This is interesting, as I am working in this same area today. A great tool that ships with WWW::Mechanize (WM) is its mech-dump script. It helps to see what you're up against:

    $ mech-dump -all http://hiphople.com/kboard >1.kboard.txt
    $ cat 1.kboard.txt | more
    --> Headers:
    Cache-Control: no-cache
    Connection: close
    Date: Sat, 18 Apr 2020 19:35:46 GMT
    Server: nginx
    Vary: Accept-Encoding
    Content-Encoding: gzip
    Content-Type: text/html
    Expires: Thu, 01 Jan 1970 00:00:01 GMT
    Client-Date: Sat, 18 Apr 2020 19:35:46 GMT
    Client-Peer: 1.234.1.230:80
    Client-Response-Num: 1
    Client-Transfer-Encoding: chunked

    --> Forms:
    POST http://hiphople.com/___verify
      <NONAME>=
      OK (submit)

    --> Links:

    --> Images:
    $
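If you would rather see the same picture from inside a script, here is a minimal sketch of roughly what mech-dump reports, using only the stock WWW::Mechanize accessors. The sub name page_summary and the choice to report bare counts are mine, not anything from mech-dump itself; treat it as a starting point rather than a reimplementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# page_summary is a hypothetical helper name: fetch a URL and count
# what plain WWW::Mechanize (no JavaScript) can see on the page.
sub page_summary {
    my ($url) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );   # die on HTTP errors
    $mech->get($url);

    # Call each accessor in list context so we get lists, not array refs.
    my @forms  = $mech->forms;
    my @links  = $mech->links;
    my @images = $mech->images;

    return {
        headers => $mech->response->headers->as_string,
        forms   => scalar @forms,
        links   => scalar @links,
        images  => scalar @images,
    };
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $summary = page_summary( $ARGV[0] );
    print "--> Headers:\n$summary->{headers}\n";
    printf "--> %s: %d\n", ucfirst, $summary->{$_} for qw(forms links images);
}
```

Run it as, say, perl summary.pl http://hiphople.com/kboard (the script name is arbitrary). If the link and image counts come back as zero, as in the dump above, the content you want is most likely being built by JavaScript after the page loads.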

Pretty sparse for mech-dump. According to WM there is almost nothing there: a single verification form, no links and no images. Point a browser at the same URL, though, and you see that the page is loaded with JavaScript, and that's where this tale changes. You might have a look at your own documentation by imitating this:

    $ locate Mechanize.pm
    /usr/local/share/perl/5.26.1/WWW/Mechanize.pm
    $ cd /usr/local/share/perl/5.26.1/WWW
    $ ls
    Mechanize  Mechanize.pm
    $ cd Mechanize/
    $ pwd
    /usr/local/share/perl/5.26.1/WWW/Mechanize
    $ ls
    Chrome     Cookbook.pod  FAQ.pod  Image.pm  Pluggable     Plugin
    Chrome.pm  Examples.pod  GZip.pm  Link.pm   Pluggable.pm
    $ perldoc FAQ.pod | more
    NAME
        WWW::Mechanize::FAQ - Frequently Asked Questions about WWW::Mechanize

    VERSION
        version 1.96

    How to get help with WWW::Mechanize
        If your question isn't answered here in the FAQ, please turn to the
        communities at:

        * StackOverflow
          <https://stackoverflow.com/questions/tagged/www-mechanize>
        * #lwp on irc.perl.org
        * <http://perlmonks.org>
        * The libwww-perl mailing list at <http://lists.perl.org>

    JavaScript
      I have this web page that has JavaScript on it, and my Mech program doesn't work.
        That's because WWW::Mechanize doesn't operate on the JavaScript. It only
    $

In my opinion, there are 2 things you need to do in order to get Perl to operate on this site and translate it to English. 1) You need to use something that can make heads or tails of JavaScript, like WWW::Mechanize::Chrome (WMC) or WWW::Mechanize::Firefox (WMF). 2) To use the site as they intend, you need to log in. Do you have an account with them?
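To make point 1 concrete, here is a rough sketch of pulling titles and URLs through WWW::Mechanize::Chrome, which drives a real Chrome so the JavaScript actually runs. I have not run this against hiphople.com, and the sub name title_url_pairs is made up for illustration. It assumes Chrome is installed where WMC can find it; it also skips the Log::Log4perl setup that I believe WMC's own synopsis does first, and it makes no attempt to get past the verification form or a login.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# title_url_pairs is a hypothetical helper: it accepts any Mechanize-like
# object whose links() returns objects with text() and url() methods
# (WWW::Mechanize::Link objects, for instance) and returns [title, url] pairs.
sub title_url_pairs {
    my ($mech) = @_;
    return map  { [ $_->text // '', $_->url ] }
           grep { defined $_->url }
           $mech->links;
}

# Only start a browser when a URL is given on the command line.
if (@ARGV) {
    # Loaded here so the helper above can be used without Chrome around.
    require WWW::Mechanize::Chrome;
    my $mech = WWW::Mechanize::Chrome->new( headless => 1 );
    $mech->get( $ARGV[0] );

    binmode STDOUT, ':encoding(UTF-8)';    # the board titles are Korean
    printf "%s\t%s\n", @$_ for title_url_pairs($mech);
}
```

Filtering those pairs down to the bulletin board posts for a given period is a separate step that depends on how the rendered page marks up its rows, which I can't see from here.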

P.S My English is not that great.

You did better than great; you did fine.

Best Wishes,
