in reply to Re^3: getting rid of UTF-8
in thread getting rid of UTF-8
I'll try to get something together and paste a hex dump. But I know that there is nothing but plain lower-128 ASCII characters (I just mentioned ISO-Latin out of habit). It is all data that I entered, and there's no data in the CSV files that isn't something I entered. I have no idea why there's a BOM in the middle of the first record. I'll get the dump.
Re^5: getting rid of UTF-8
by BernieC (Pilgrim) on Nov 25, 2022 at 03:25 UTC
And here's the hex dump of it. Notice from the dump that there is another EFBBBF toward the end of the file. And I tried to brute-force it, and it didn't work! I did the $line =~ s/\xef\xbb\xbf// and it didn't remove the characters! I'll try again...
by haukex (Archbishop) on Nov 25, 2022 at 09:43 UTC
"I did the $line =~ s/\xef\xbb\xbf// and it didn't remove the characters!"
Using the advice from kcott to add /g, it works for me; without /g, only the first occurrence in the string is removed. If it really doesn't work for you, then perhaps the data you have in your Perl string is not what you think it is. See my node on how to show us the real data, in particular with Devel::Peek, and make sure to provide an SSCCE that we can run to see the problem for ourselves.
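For anyone following along, here is a minimal sketch of the /g approach. It is not the code from either poster: the helper names strip_bom and show_codepoints and the sample CSV line are made up for illustration. It also assumes one common reason a byte-level substitution can appear to do nothing: if the file was read through a UTF-8 decoding layer, the BOM is the single character U+FEFF in the string rather than the three bytes EF BB BF, so the sketch removes both forms.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove every BOM in a string,
# whether it is still three raw bytes or was already decoded to U+FEFF.
sub strip_bom {
    my ($line) = @_;
    $line =~ s/\xEF\xBB\xBF//g;   # undecoded bytes; /g catches all occurrences
    $line =~ s/\x{FEFF}//g;       # the same BOM after UTF-8 decoding
    return $line;
}

# Hypothetical helper: list each character's code point in hex so that
# invisible characters show up in plain ASCII output.
sub show_codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $str;
}

# Made-up sample: a BOM at the start and another one in the middle.
my $raw = "\xEF\xBB\xBFname,qty\n\xEF\xBB\xBFwidget,3\n";

print show_codepoints($raw), "\n";   # what the string really contains
print strip_bom($raw);               # both BOMs removed
```

Printing the code points before and after the substitution is a cheap way to check which form of the BOM the string actually holds; Devel::Peek gives a more detailed view of the same thing.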
by BernieC (Pilgrim) on Nov 25, 2022 at 15:08 UTC
and when I run it on one of the BOM'ed files I get: What am I getting wrong/missing?
by haukex (Archbishop) on Nov 25, 2022 at 16:09 UTC
by BernieC (Pilgrim) on Nov 25, 2022 at 17:14 UTC
by Anonymous Monk on Nov 25, 2022 at 16:19 UTC