PerlMonks  

Everything works perfectly as long as my zip file has files named with Latin characters, but things get worse when the names are Chinese or Japanese.

If you can answer a couple of questions, it may give us the information that would allow us to actually help you...

  • When you say "things get worse", what does that mean, exactly? Do you get an error message? Does extractMemberWithoutPaths return an error code? Are the files written at all? Do their filenames get mangled? What do the resulting mangled filenames look like? The phrase "get worse" is kind of vague, so I'm not really sure what's going wrong, and without knowing what's going wrong, it's hard to know how to fix it.
  • Can you, via some other means (say, using a file manager, or on the command line) create files with CJK characters in their filenames in the location where you are trying to extract these? Not all filesystems support such things, so without knowing what kind of filesystem your storage device is formatted with, we can't know for sure that it's even theoretically possible for such filenames to be created. Do you know what kind of filesystem it is? ext3? NTFS? HFS+? FAT32? Something else? (If you don't know the answer to this, just telling us what operating system you're using and whether you're saving on your computer's main hard drive or to a USB flash drive or some other location could provide clues.) Update: I just noticed the "c:/somedir" in your code, which I suspect narrows things down a little. NTFS *ought* to be able to handle CJK filenames, I think, although depending on what version of Windows you have it might require that the relevant language options be installed, in the language settings in the Control Panel. If you're using a really old Windows (95/98/Me) or for some other reason are using FAT32, then I'm less sure.
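
The filesystem check in the second question can also be done from Perl. What follows is only a rough sketch, not part of the original test: the helper name can_create_file is made up for illustration, a temporary directory stands in for the real extraction directory, and names are encoded as UTF-8 bytes, which suits typical Linux setups. On Windows, Perl's built-in open usually goes through the ANSI code page, so there a file manager is probably the more reliable test.

```perl
use strict;
use warnings;
use utf8;                        # this source contains literal CJK characters
use Encode qw(encode);
use File::Temp qw(tempdir);

# Hypothetical helper: try to create (and then remove) an empty file with
# the given name inside $dir. Returns 1 on success, 0 on failure.
# Assumption: the filesystem expects UTF-8 encoded byte names.
sub can_create_file {
    my ($dir, $name) = @_;
    my $path = "$dir/" . encode('UTF-8', $name);
    open(my $fh, '>', $path) or return 0;
    close $fh;
    unlink $path;
    return 1;
}

my $dir = tempdir(CLEANUP => 1); # stand-in for the real extraction directory
binmode STDOUT, ':encoding(UTF-8)';
for my $name ('中文', '日本語', '한국어') {
    printf "%s: %s\n", $name, can_create_file($dir, $name) ? 'ok' : 'failed';
}
```

If any of the names comes back as failed, the problem is more likely the filesystem or locale than Archive::Zip.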

Oh, one other thing: the following code works for me (Perl 5.10.1, debian oldstable amd64):

    nathan@warthog:~/test2/extract$ ls
    somefile.zip
    nathan@warthog:~/test2/extract$ perl -e '
      $filename = "somefile.zip";
      $dest_dir = "/home/nathan/test2/extract";
      use Archive::Zip;
      my $zip = Archive::Zip->new();
      local $Archive::Zip::UNICODE = 1;
      unless ( $zip->read($filename) == AZ_OK ) {
        die "Error Reading Zip File !";
      }
      foreach my $m ($zip->members()) {
        print "Member $m:\n ";
        my $err = $zip->extractMemberWithoutPaths( $m, "$dest_dir/" . $m->fileName);
        print "Error: $err" if $err;
        print $/;
      }'
    Member Archive::Zip::ZipFileMember=HASH(0xdfdd30):
    Member Archive::Zip::ZipFileMember=HASH(0xdfe2b8):
    Member Archive::Zip::ZipFileMember=HASH(0xdfe5a0):
    Member Archive::Zip::ZipFileMember=HASH(0xdfe888):
    Member Archive::Zip::ZipFileMember=HASH(0xdfeb98):
    Member Archive::Zip::ZipFileMember=HASH(0xdfee80):
    nathan@warthog:~/test2/extract$ ls
    한국어  somefile.zip  ગુજરાતી  ಕನ್ನಡ  বাংলা  中文  日本語
    nathan@warthog:~/test2/extract$
(PerlMonks seems unable or perhaps unwilling to handle most of those characters -- and if unwilling I can't blame them; this is by design an English-language venue -- but they display just fine on my terminal when I run ls. Of course, I created my somefile.zip using the zip program that comes with Debian; yours may have been created using different software...)
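
If the names do come out mangled, it may help to look at the raw bytes, since a zip made by other software may store names in a legacy encoding rather than UTF-8. Here is a small sketch of how that could be checked. The helper describe_name_bytes is made up for illustration, and the sample byte strings are hand-written stand-ins for whatever $m->fileName returns (assuming it hands back the stored bytes when $Archive::Zip::UNICODE is not set).

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: given a filename as a raw byte string, return its
# bytes in hex plus a note on whether they form valid UTF-8.
sub describe_name_bytes {
    my ($bytes) = @_;
    my $hex  = join ' ', unpack('(H2)*', $bytes);
    my $copy = $bytes;           # decode() may modify its input buffer
    my $ok   = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 };
    return sprintf '%s (%s)', $hex, $ok ? 'valid UTF-8' : 'not valid UTF-8';
}

# The same two characters (中文) as UTF-8 bytes and as Shift_JIS bytes.
print describe_name_bytes("\xe4\xb8\xad\xe6\x96\x87"), "\n";
print describe_name_bytes("\x92\x86\x95\xb6"), "\n";
```

Posting that hex output for one or two of the problem filenames would tell us a lot more than "things get worse".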


In reply to Re: Seeking help with Extracting files from zip by jonadab
in thread Seeking help with Extracting files from zip by aksjain
