PerlMonks  

Re: Correct Perl settings for sending zipfile to browser

by cavac (Curate)
on Nov 14, 2019 at 14:42 UTC (#11108673)


in reply to Correct Perl settings for sending zipfile to browser

In my opinion, you are mixing up a few things.

First of all, you don't seem to check the HTTP method; you unlink (delete) the file no matter what. It's important to realise that, depending on the browser/client/user agent, there may be several requests for the same file: for example a HEAD request first, then partial downloads with GET (via range requests). HEAD and GET are idempotent, meaning that multiple requests to the same resource should yield the same result, unless you explicitly send an expiry header that says the browser can't rely on that.

To quote the Wikipedia article on idempotence: "Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application." There is also a nice explanation on Stack Overflow, which you should read.

If you re-create the ZIP file for each request, it might change, not only depending on the data from the database, but also on the exact way ZIP is implemented (I'm not sure, but there could be a random element in how your implementation calculates the internal lookup tables). In that case, the partially downloaded parts might or might not fit together into a valid file.

Here is an example header from my own webserver software for a zip file download (which works):

200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Thu, 14 Nov 2019 14:09:02 GMT
Accept-Ranges: bytes
ETag: b5e50bf7ab874841149553d71570c0ba4b54e940
Server: PAGECAMEL/2.4
Allow: GET, HEAD
Content-Language: en
Content-Length: 5199736
Content-Type: application/octet-stream
Expires: Thu, 14 Nov 2019 14:09:02 GMT
Last-Modified: Sat, 03 Aug 2019 22:38:27 GMT

As you can see, I disable caching, have a stable ETag header (basically the checksum of the file) and an Expires header that says "it expires now". The ETag is there so that when a browser requests the file again, it can first check with HEAD whether the ETag has changed; if it hasn't, it doesn't have to download the file again. Also, on partial downloads, it can check the ETag and, if it has changed, show a proper error message along the lines of "can't continue download because the requested resource has changed".
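A minimal sketch of how such a stable ETag could be produced and checked. The 40 hex characters in the header above look like a SHA-1 digest, but that is an assumption, and the names file_etag and etag_matches are made up for this example:

```perl
use strict;
use warnings;
use Digest::SHA;

# Compute a stable ETag for a file: the SHA-1 hex digest of its bytes.
# SHA-1 is an assumption; any checksum that only changes when the
# content changes would do.
sub file_etag {
    my ($path) = @_;
    my $sha = Digest::SHA->new(1);
    $sha->addfile($path, 'b');    # 'b' reads the file in binary mode
    return $sha->hexdigest;
}

# True if the client's If-None-Match header already names this ETag.
# In that case the server may answer "304 Not Modified" without a body.
sub etag_matches {
    my ($if_none_match, $etag) = @_;
    return 0 unless defined $if_none_match;
    for my $candidate (split /\s*,\s*/, $if_none_match) {
        $candidate =~ s/^\s+|\s+$//g;    # trim whitespace
        $candidate =~ s/^W\///;          # ignore the weak-validator prefix
        $candidate =~ s/^"|"$//g;        # strip surrounding quotes
        return 1 if $candidate eq '*' || $candidate eq $etag;
    }
    return 0;
}
```

In a CGI script the header value would typically arrive as $ENV{HTTP_IF_NONE_MATCH}. The important part is that the ETag only stays stable if the ZIP file itself is not rebuilt between requests.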

If you want to implement one-time downloads, you should make sure you implement it in a POST method (and check that the browser has used the correct one), as POST is not idempotent and allows resources on the server to change in response to client action.
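As a rough sketch of that method check (the function name plan_for_method and the one_time flag are invented for this example, not taken from your code):

```perl
use strict;
use warnings;

# Decide what a download handler may do for a given HTTP method.
# $one_time is a hypothetical flag: true if the file must be deleted
# after it has been delivered once.
# Returns a hash ref with the status line, the Allow header value,
# whether to send the body and whether to delete the file afterwards.
sub plan_for_method {
    my ($method, $one_time) = @_;
    $method = uc($method // '');

    if ($one_time) {
        # Only POST may change state on the server.
        return { status => '200 OK', allow => 'POST', send_body => 1, delete_after => 1 }
            if $method eq 'POST';
        return { status => '405 Method Not Allowed', allow => 'POST', send_body => 0, delete_after => 0 };
    }

    # Repeatable download: GET and HEAD never delete anything.
    return { status => '200 OK', allow => 'GET, HEAD', send_body => 1, delete_after => 0 }
        if $method eq 'GET';
    return { status => '200 OK', allow => 'GET, HEAD', send_body => 0, delete_after => 0 }
        if $method eq 'HEAD';
    return { status => '405 Method Not Allowed', allow => 'GET, HEAD', send_body => 0, delete_after => 0 };
}
```

In a CGI script you would call it as plan_for_method($ENV{REQUEST_METHOD}, 1), send the response, and only then unlink the file if delete_after is set.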

You are also using an experimental content type, application/x-download, which may or may not be supported by browsers. Try application/octet-stream instead; this tells the browser "I'm sending you some binary junk, do whatever you want with it".

The Accept-Ranges header is there because I allow range requests. This helps a lot with the download manager integrated into modern browsers, especially on large files.
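If you want to support range requests yourself, the parsing side could look roughly like this. It is a sketch that only handles a single byte range, and parse_byte_range is a made-up name:

```perl
use strict;
use warnings;

# Parse a single-range "Range: bytes=..." header against a file of $size bytes.
# Returns (first, last) as inclusive byte offsets, or an empty list if the
# header is missing, malformed, multi-range or not satisfiable.
sub parse_byte_range {
    my ($header, $size) = @_;
    return unless defined $header && $size > 0;
    return unless $header =~ /^bytes=(\d*)-(\d*)$/;
    my ($first, $last) = ($1, $2);
    return if $first eq '' && $last eq '';

    if ($first eq '') {
        # Suffix form "bytes=-N": the last N bytes of the file.
        return if $last == 0;
        my $n = $last > $size ? $size : $last;
        return ($size - $n, $size - 1);
    }

    return if $first >= $size;
    # Open-ended or overlong ranges are clamped to the end of the file.
    $last = $size - 1 if $last eq '' || $last >= $size;
    return if $last < $first;
    return ($first + 0, $last + 0);
}
```

For a valid range the response would be "206 Partial Content" with a "Content-Range: bytes first-last/size" header, and the Content-Length would be last - first + 1.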

The Content-Length header is also very important, because it allows connection reuse (keep-alive) and lets the browser verify it got ALL of the data. It must be byte-exact; any error will mess up the download. That includes any stray line endings at the end of the content.

There are a few other things that you should fix in your code, like using a proper three-argument open.
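To illustrate the last two points, here is a minimal sketch of sending a file with a three-argument open and a byte-exact Content-Length. The function name send_file and its arguments are invented for this example; a real script would add the other headers shown above:

```perl
use strict;
use warnings;

# Write a minimal CGI-style response for a binary file to $out.
# $out defaults to STDOUT; $path is wherever the finished ZIP lives.
# Returns the number of bytes announced in Content-Length.
sub send_file {
    my ($path, $out) = @_;
    $out //= \*STDOUT;

    # Three-argument open with a lexical handle, read as raw bytes.
    open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
    my $size = -s $in;    # exact size in bytes

    binmode $out;         # no newline translation on output
    print {$out} "Content-Type: application/octet-stream\r\n";
    print {$out} "Content-Length: $size\r\n";
    print {$out} "\r\n";

    # Copy in chunks and print nothing after the loop, so the body
    # is exactly $size bytes long.
    while (read($in, my $chunk, 65536)) {
        print {$out} $chunk;
    }
    close $in;
    return $size;
}
```

Taking the size from the file on disk (rather than counting what you think you printed) is what keeps the Content-Length honest, as long as nothing else writes to the output afterwards.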

If you have LWP installed, you could check your headers with GET https://myurl on the command line.

perl -e 'use MIME::Base64; print decode_base64("4pmsIE5ldmVyIGdvbm5hIGdpdmUgeW91IHVwCiAgTmV2ZXIgZ29ubmEgbGV0IHlvdSBkb3duLi4uIOKZqwo=");'
