JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for characters in range 127 to 255
by Ovid (Cardinal)
on Dec 06, 2014 at 20:31 UTC
Ovid has asked for the wisdom of the Perl Monks concerning the following question:
Update: PerlMonks seems to have trouble rendering some of this. The question is also on Stack Overflow.
I'm getting some corrupted JSON and I've reduced it down to this test case.
And this is the output:
So the string containing guillemets («») is valid UTF-8, but the resulting JSON is not. What am I missing? The `utf8` pragma is correctly marking my source. Further, that trailing 187 is from the diag. That's less than 255, so it almost looks like a variant of the old Unicode bug in Perl. (And the test output still looks like crap. Never could quite get that right with Test::Builder).
Switching to `JSON::PP` produces the same output.
Further testing reveals the same failure for every character in the range 127 to 255.
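For anyone who wants to poke at the range themselves, here is a minimal sketch of how such a sweep could be written. It is not the test case from this post: `bytes_are_valid_utf8` is a hypothetical helper, and comparing a plain `JSON::PP->new` encoder against one with `->utf8` enabled is an assumption about what might be relevant, not a confirmed diagnosis.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper (not from the original test case): returns true
# if the octets of the string form well-formed UTF-8. utf8::decode
# works in place, so it is run on a copy.
sub bytes_are_valid_utf8 {
    my ($octets) = @_;
    my $copy = $octets;
    return utf8::decode($copy) ? 1 : 0;
}

my $text_json  = JSON::PP->new;          # encode() returns a character string
my $bytes_json = JSON::PP->new->utf8;    # encode() returns UTF-8 encoded octets

my ( @text_invalid, @bytes_invalid );
for my $cp ( 127 .. 255 ) {
    my $data = { string => chr($cp) };

    # Record every code point whose JSON does not pass the octet-level check.
    push @text_invalid,  $cp unless bytes_are_valid_utf8( $text_json->encode($data) );
    push @bytes_invalid, $cp unless bytes_are_valid_utf8( $bytes_json->encode($data) );
}

printf "without ->utf8: %d code points fail the check\n", scalar @text_invalid;
printf "with ->utf8:    %d code points fail the check\n", scalar @bytes_invalid;
```

The point of running both encoders side by side is only to see whether the failure depends on the output being a character string rather than encoded octets; it says nothing about which form a given validity test expects.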
This is Perl 5.18.1 running on OS X Yosemite.