Gzip-compressed transfers depend on the client sending an "Accept-Encoding" header in the request, and that header has to contain the string "gzip". (Other compression schemes like bzip2 are also possible.)

If the server supports gzip and the client has requested it, the server *may* decide to send the BODY of the response compressed as a gzip stream (depending on things like whether the file is compressible and whether the server wants to spend CPU resources to reduce network load at this point in time). To do this, it adds a "Content-Encoding" header to the response with the value set to "gzip".
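To make the negotiation concrete, here is a minimal sketch of the server side, using only core modules. The helper names client_accepts_gzip and build_response are made up for illustration, and this imaginary server only compresses when the client explicitly asks for gzip; a real server applies its own policy on top of that.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical helper: true if the Accept-Encoding value lists "gzip"
# with a non-zero quality value.
sub client_accepts_gzip {
    my ($accept_encoding) = @_;
    return 0 unless defined $accept_encoding;
    $accept_encoding =~ s/^\s+|\s+$//g;
    for my $item (split /\s*,\s*/, $accept_encoding) {
        my ($coding, @params) = split /\s*;\s*/, $item;
        next unless defined $coding && lc($coding) eq 'gzip';
        my ($q) = map { /^q=(\d(?:\.\d*)?)$/i ? $1 : () } @params;
        return (!defined $q || $q > 0) ? 1 : 0;
    }
    return 0;
}

# Hypothetical helper: returns (\%headers, $body) as this sketch's
# server would send them for the given request header value.
sub build_response {
    my ($accept_encoding, $body) = @_;
    my %headers = ('Content-Type' => 'text/html');
    if (client_accepts_gzip($accept_encoding)) {
        my $compressed;
        gzip(\$body => \$compressed) or die "gzip failed: $GzipError";
        $headers{'Content-Encoding'} = 'gzip';
        $body = $compressed;
    }
    $headers{'Content-Length'} = length $body;
    return (\%headers, $body);
}

my $html = '<p>hello</p>' x 200;
for my $ae ('gzip, deflate', 'identity', undef) {
    my ($headers, $body) = build_response($ae, $html);
    printf "Accept-Encoding: %-14s => Content-Encoding: %s, %d bytes\n",
        defined $ae ? $ae : '(none)',
        $headers->{'Content-Encoding'} // '(none)',
        length $body;
}
```

Only the first request gets a gzip body plus the "Content-Encoding: gzip" header; the other two get the plain HTML.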

From what I remember, ye olde WWW::Mechanize doesn't send any Accept-Encoding header, which is what gets it into trouble sometimes. Let me quote from RFC 7231, page 41, section "5.3.4 Accept-Encoding", item 1:

If no Accept-Encoding field is in the request, any content-coding is considered acceptable by the user agent.

Here is the link: https://tools.ietf.org/html/rfc7231#page-41

This is what can get WWW::Mechanize in trouble, because the server MAY decide to use gzip, bzip2 or whatever in the reply. If you use WWW::Mechanize::GZip, which *does* send the correct header, the server is only allowed to send either an uncompressed or a gzip-compressed body, and WWW::Mechanize::GZip understands both as far as I remember. It's just the more reliable option.
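Switching over should be close to a drop-in change, since WWW::Mechanize::GZip is a subclass of WWW::Mechanize. A minimal sketch, assuming the module is installed from CPAN; fetch_html is a made-up helper name and the URL comes from the command line:

```perl
use strict;
use warnings;
use WWW::Mechanize::GZip;

# Hypothetical helper: fetch a page and return its HTML, which the
# module has already decompressed if the server sent it gzipped.
sub fetch_html {
    my ($url) = @_;
    my $mech = WWW::Mechanize::GZip->new();
    $mech->get($url);
    die "GET $url failed: " . $mech->status . "\n" unless $mech->success;
    return $mech->content;
}

# Only touches the network when a URL is given on the command line.
if (@ARGV) {
    my $html = fetch_html($ARGV[0]);
    printf "got %d characters of HTML\n", length $html;
}
```

The rest of an existing WWW::Mechanize script (following links, submitting forms and so on) should not need to change.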

BTW, when we are talking about compressing the transfer like this (what HTTP calls a content coding), this isn't the same as a "file format". So you won't download a .gz file and unzip it. Instead, the content just gets gzipped on the server side for sending over the network, and then it gets automatically decompressed by the client library before it is handed (uncompressed) to your script. This is just to speed up the transfer. In practice, your script should not even notice (or care) that this compression magic is going on in the background to save network bandwidth.
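You can see that transparency without any network at all. The sketch below fakes a gzip-encoded response by hand and lets HTTP::Response (from the HTTP::Message distribution that LWP is built on) undo the encoding. The HTML string is made up, and charset => 'none' is only there so the result can be compared byte for byte with the original.

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Made-up page content standing in for what the server has on disk.
my $html = "<html><body>" . ("hello world " x 100) . "</body></html>";

# What the server would put on the wire after deciding to use gzip.
gzip(\$html => \my $wire_body) or die "gzip failed: $GzipError";

my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html', 'Content-Encoding' => 'gzip' ],
    $wire_body,
);

# content() is the raw body as transferred; decoded_content() undoes
# the Content-Encoding and is what a script normally works with.
my $decoded = $response->decoded_content(charset => 'none')
    // die "could not decode response body\n";

printf "bytes on the wire:    %d\n", length $response->content;
printf "bytes seen by script: %d\n", length $decoded;
print  "identical to original: ", ($decoded eq $html ? "yes" : "no"), "\n";
```

The script never handles a .gz file; it just gets the HTML back, and only the wire size shows that anything was compressed.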

perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'

In reply to Re: Problem while using WWW::Mechanize module for getting html by cavac
in thread Problem while using WWW::Mechanize module for getting html by yujong_lee
