in reply to Malformed UTF-8 character
When I go to ?abspart=1;part=1;displaytype=displaycode;node_id=11148406, use Save As, and open the result in Notepad++, Notepad++ detects the encoding as "ANSI" (which on my system is Windows-1252). Running that file gives me the "Malformed UTF-8 character" error: the file contains the single byte 0x96, but the use utf8 line tells the interpreter to read the source as UTF-8, and a lone 0x96 byte is not valid UTF-8. If I instead copy the contents manually from the browser, paste them into a new file in Notepad++ (which defaults to UTF-8 for me), save it, and run it, it runs just fine. Alternatively, if I comment out use utf8 in the downloaded version, it also works.
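A third option, if you want to keep use utf8 and the downloaded file, is to re-encode the file. Here is a minimal sketch, assuming the download really is Windows-1252; the sub name cp1252_to_utf8 and the file paths are made up for illustration.

```perl
use strict;
use warnings;

# Re-encode a file from Windows-1252 to UTF-8 (hypothetical helper).
# Assumes the input really is Windows-1252, as Notepad++ reported above.
sub cp1252_to_utf8 {
    my ($in_path, $out_path) = @_;
    open my $in,  '<:encoding(cp1252)', $in_path
        or die "Cannot read $in_path: $!";
    open my $out, '>:encoding(UTF-8)', $out_path
        or die "Cannot write $out_path: $!";
    # Each line is decoded on read and encoded on write by the I/O layers.
    print {$out} $_ while <$in>;
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}

# Usage: perl convert.pl downloaded.pl fixed.pl
cp1252_to_utf8(@ARGV) if @ARGV == 2;
```

After conversion, the byte 0x96 becomes the three-byte UTF-8 sequence 0xE2 0x80 0x93, which is what use utf8 expects.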
The problem is that the PerlMonks website serves the page as Content-Type: text/plain; charset=ISO-8859-1, so any bytes that get saved use that encoding. (Technically, – is at 0x96 in Windows-1252 but not in ISO-8859-1, where 0x96 is a control character.) Saying use utf8 tells perl to interpret the bytes in the source code as UTF-8, so it tries to read the ISO-8859-1 or Windows-1252 bytes as UTF-8, and fails on any byte above 127.
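To see the three interpretations of that byte side by side, here is a small sketch using the core Encode module. The sub name codepoint_of_0x96 is mine, purely for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode the single byte 0x96 under the named encoding.
# Returns a string such as "U+2013", or undef if the byte is invalid there.
sub codepoint_of_0x96 {
    my ($encoding) = @_;
    my $bytes = "\x96";    # copy, because decode() with a CHECK may modify it
    my $text  = eval { decode($encoding, $bytes, Encode::FB_CROAK()) };
    return defined $text ? sprintf('U+%04X', ord $text) : undef;
}

for my $enc (qw(cp1252 iso-8859-1 UTF-8)) {
    my $cp = codepoint_of_0x96($enc);
    printf "%-10s %s\n", $enc, $cp // 'invalid';
}
```

This should report U+2013 (the en dash) for cp1252, U+0096 (a C1 control character) for iso-8859-1, and "invalid" for UTF-8, since 0x96 on its own is a continuation byte with no lead byte before it.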