
Re: By the shine on my bald pate, I dislike this encoding stuff

by haukex (Archbishop)
on Mar 04, 2018 at 11:52 UTC ( [id://1210310]=note )


in reply to By the shine on my bald pate, I dislike this encoding stuff

In addition to the issue poj pointed out with reading from <ORDERFILE> twice (my(@LINES) = <ORDERFILE> reads all lines from the file, so $filedata would normally be empty), I just wanted to point out that the pattern eval {...}; if ($@) {...} has issues ($@ can be reset or clobbered between the eval and the check), and that the pattern eval {...; 1} or do {...} or a module like Try::Tiny is better. Also, nowadays lexical filehandles (open my $fh, ...) are generally preferred over bareword filehandles (open ORDERFILE, ...). (Update: The AM also made a good point that you appear to be decoding the data twice.)
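As a minimal sketch of the eval {...; 1} or do {...} pattern: the block ends in a true value, so eval returns false only when the code died, and the error handling no longer depends on $@ being intact. The risky() sub and its error message are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical function that dies on bad input (illustration only).
sub risky {
    my ($n) = @_;
    die "negative input\n" if $n < 0;
    return sqrt $n;
}

my ($value, $error);

# The trailing "1" makes eval return true on success, so the "or do"
# branch runs only if the block died.
eval { $value = risky(-1); 1 }
    or do { $error = $@ || 'unknown error' };

# Same pattern on the success path: the handler is skipped.
my $good;
eval { $good = risky(9); 1 }
    or do { die "unexpected failure: " . ($@ || 'unknown error') };

print "error: $error";
print "good:  $good\n";
```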

Really, the best way to go is to know in advance what encoding your files are in, and then open them with the appropriate encoding layer: open my $fh, '<:encoding(...)', $filename or die $!;
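A short sketch of opening a file with a known encoding, assuming it is UTF-8. The temporary file and its contents exist only to make the example self-contained; in real code $filename would be your own file.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Create a throwaway file containing "naïve" as raw UTF-8 bytes.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
binmode $tmp_fh, ':raw';
print $tmp_fh "na\xC3\xAFve\n";
close $tmp_fh or die "$filename: $!";

# Reading through the encoding layer yields characters, not bytes.
open my $fh, '<:encoding(UTF-8)', $filename or die "$filename: $!";
chomp( my $line = <$fh> );
close $fh;

# The file holds 6 bytes before the newline, but only 5 characters.
printf "%d characters\n", length $line;
```

When printing such decoded strings, the output handle needs an encoding layer too, for example binmode STDOUT, ':encoding(UTF-8)'.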

You may want to have a look at The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

If you are sure that your files can only ever be UTF-8 or "ANSI" (by which I'm going to assume you mean Windows-1252), then you could use the following to guess which of the two the file is encoded in. Be aware that this may still get things wrong: if you also have files encoded in, say, Latin-1 or Latin-9, decoding may not throw an error, because those encodings are so similar to CP1252!

use warnings;
use strict;
use Encode qw/decode/;

sub guess_utf8_cp1252 {  # WARNING: Does NOT work for other encodings
    my ($fn) = @_;
    open my $fh, '<:raw', $fn or die "$fn: $!";
    my $raw = do { local $/; <$fh> };  # slurp
    close $fh;
    my $decoded;
    eval { $decoded = decode('UTF-8', $raw, Encode::FB_CROAK ); 1}
    or eval { $decoded = decode('CP1252', $raw, Encode::FB_CROAK ); 1}
    or die "$fn: Could not decode";
    return $decoded;
}
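To illustrate why UTF-8 has to be tried first, here is a small sketch using in-memory byte strings instead of files. The sample strings are made up, and Encode::LEAVE_SRC is added here only so that the source variables are left untouched between calls.

```perl
use strict;
use warnings;
use Encode qw/decode/;

my $check = Encode::FB_CROAK | Encode::LEAVE_SRC;

my $utf8_bytes   = "caf\xC3\xA9";  # "café" encoded as UTF-8
my $cp1252_bytes = "caf\xE9";      # "café" encoded as CP1252

# Valid UTF-8 decodes cleanly.
my $from_utf8 = decode('UTF-8', $utf8_bytes, $check);

# A lone 0xE9 byte is not valid UTF-8, so strict decoding dies
# and eval returns false.
my $cp1252_is_utf8 = eval { decode('UTF-8', $cp1252_bytes, $check); 1 };

# The fallback succeeds and yields the same characters.
my $from_cp1252 = decode('CP1252', $cp1252_bytes, $check);

# The reverse order would be wrong: UTF-8 bytes also decode "successfully"
# as CP1252, but produce mojibake (two characters instead of one).
my $mojibake = decode('CP1252', $utf8_bytes, $check);

printf "same text: %s\n", $from_utf8 eq $from_cp1252 ? "yes" : "no";
printf "mojibake length: %d\n", length $mojibake;
```

This is also why the guess cannot tell CP1252 apart from other single-byte encodings: almost any byte sequence decodes without error in them.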
