PerlMonks  

Re: Correct Perl settings for sending zipfile to browser

by tobyink (Canon)
on Nov 14, 2019 at 10:11 UTC


in reply to Correct Perl settings for sending zipfile to browser

MacOS (darwin), Linux, and UNIX all have the same "\n" endings, whereas Windows/DOS uses "\r\n". Some code has been added to facilitate this variation.

Firstly, I'd get rid of that. Most Windows software will cope fine with "\n" line endings, including pretty much any spreadsheet or database software you're planning to import the tab-delimited file into. It's only a few really basic tools like Notepad that might not. Once you get your code working, if you decide you really need "\r\n" on Windows, then you can add that code back in, but for now, I'd suggest removing it. Simplifying your code will help you find where your bug is.

For now, I'd also suggest not generating a Content-Length HTTP header; perhaps it is somehow wrong and this is resulting in the browser truncating the file.

Once you've done that, compare the size of the file and the MD5 sum of the file at both the client and server end, and check they match. (Don't unlink the ZIP file at the end, so you can compare them.)
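A minimal sketch of that comparison, assuming you can run the same script on both machines. The sub name size_and_md5 is made up for this example; Digest::MD5 ships with Perl. The file is opened in raw mode so the checksum covers the exact bytes on disk.

```perl
use strict;
use warnings;
use Digest::MD5;

# Return (size in bytes, hex MD5 digest) for a file, read as raw bytes.
sub size_and_md5 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return (-s $path, $md5);
}

# When run with a filename argument, print the two values for that file.
printf "%d bytes, md5 %s\n", size_and_md5($ARGV[0]) if @ARGV;
```

Run it against the ZIP on the server and against the downloaded copy; if either number differs, the file was altered somewhere between disk and browser.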

My gut feeling is that you're somehow appending some extra data at the end of the file when you send it. ZIP files are unusual in that they include header-like information at the end of the file instead of the start. This is a throwback to the days when people would write ZIP files that spanned multiple floppy disks, and the header information couldn't be written until the process was finished, so got written onto the last floppy disk. (When you unzipped the file, you needed to insert the last disk first so the header could be read, then start at the beginning and insert them all in order until you got to the last one again which would need to be read again for its non-header data.) So yeah, you've probably got some extra data like an error message or warning or even a few line break characters, at the end of your ZIP.
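One way to test that guess is to look for the "end of central directory" record, which starts with the bytes "PK\x05\x06", is 22 bytes long, and is followed only by an optional comment whose length is stored in its last two bytes. The sketch below (zip_trailing_bytes is a name invented for this example) slurps the file, finds the last such signature, and reports how many bytes follow the end of the archive. It assumes the extra data doesn't itself contain that signature, and that the file is small enough to hold in memory.

```perl
use strict;
use warnings;

# Return the number of bytes that follow the end of the ZIP archive:
#   0     -> the file ends exactly where the archive does
#   > 0   -> extra data has been appended after the archive
#   < 0   -> the file is shorter than the record says it should be
#   undef -> no end-of-central-directory record was found
sub zip_trailing_bytes {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $data = do { local $/; <$fh> };
    close $fh;
    $data = '' unless defined $data;

    # Find the last end-of-central-directory signature in the file.
    my $pos = rindex($data, "PK\x05\x06");
    return if $pos < 0;
    return if length($data) < $pos + 22;

    # The comment length is a 16-bit little-endian value at offset 20.
    my $comment_len = unpack 'v', substr($data, $pos + 20, 2);
    return length($data) - ($pos + 22 + $comment_len);
}
```

A positive result on the downloaded copy, with zero on the server copy, would point at something being printed after the ZIP data.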

Re^2: Correct Perl settings for sending zipfile to browser
by Anonymous Monk on Nov 15, 2019 at 01:22 UTC

    Following your advice, I removed the Content-Length header. The file size of the downloaded zip increased. An error message indicated that I should check the binmode of the transferred file, so I changed that to the UTF-8 version. My file size increased again.

    Clearly, there is still some issue with the proper transfer from server to client.

    After learning this, I decided to retry the download a few times to see if the file sizes would be the same. I found the following issues, and made notes of them:

    #File size, (download attempt), unzip's reported missing bytes
    #4,569,935 (1) missing 942842225
    #4,569,765 (2) missing 942842396
    #4,571,674 (3) missing 942840486
    #4,569,765 (4) missing 942842396
    #4,571,656 (5) missing 942840504
    

    As the data indicates, only the second and fourth attempts resulted in identical numbers. Note that no change to the code was made between any of the attempts.

    With results this erratic, I'm not sure where to look next.

    Here's a current error message, for comparison.

    $ unzip -v DB_ExportFile_2019-11-14.txt\(4\).zip 
    Archive:  DB_ExportFile_2019-11-14.txt(4).zip
    
    caution:  zipfile comment truncated
    error DB_ExportFile_2019-11-14.txt(4).zip:  missing 942842396 bytes in zipfile
      (attempting to process anyway)
    error DB_ExportFile_2019-11-14.txt(4).zip:  start of central directory not found;
      zipfile corrupt.
      (please check that you have transferred or created the zipfile in the
      appropriate BINARY mode and that you have compiled UnZip properly)
    

    It should also be noted that the original text file, before being zipped, weighs in at about 22 MB -- far short of the number of supposed missing bytes indicated in the error message.

      I wouldn't expect the file sizes to always be perfectly identical anyway. Your SELECT statement doesn't include an ORDER BY, so won't always return rows in the same order, and depending on what order they're returned, this will make subtle differences to how compressible the file is.

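      Pretty sure you don't want to be reading the file as UTF-8. A ZIP is binary data, and a UTF-8 output layer turns every byte above 0x7F into two bytes, which would be consistent with the file growing. It should be raw. Might want to binmode STDOUT to raw too.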
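Something along these lines is what I have in mind. It is only a sketch: send_raw is a name made up for this example, not something from your script, and it assumes the ZIP already exists on disk. Both the input and output handles are put in raw mode so no encoding or newline translation touches the bytes.

```perl
use strict;
use warnings;

# Copy a file to an output handle byte for byte, with no encoding or
# newline-translation layers on either side. Returns the bytes copied.
sub send_raw {
    my ($path, $out) = @_;
    open my $in, '<:raw', $path or die "Cannot open $path: $!";
    binmode $out, ':raw' or die "Cannot binmode output: $!";

    my $total = 0;
    my $buf;
    # read() returns 0 at end of file and undef on error; both end the loop.
    while (my $n = read $in, $buf, 65536) {
        print {$out} $buf or die "Write failed: $!";
        $total += $n;
    }
    close $in;
    return $total;
}
```

In a CGI script you would print the headers first and then call send_raw($zip_path, \*STDOUT), where $zip_path stands for wherever your script wrote the archive. Nothing else should be printed to STDOUT after that call.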
