
Ha! What would you say now, ikegami? When people like James Keenan and Ovid don't understand how this stuff works... can less experienced programmers even hope to ever get this right?

So the string containing guillemets («») is valid UTF-8, but the resulting JSON is not. What am I missing? The `utf8` pragma is correctly marking my source.

JSON::XS says:

(encode_json) Converts the given Perl data structure to a UTF-8 encoded, binary string (that is, the string contains octets only). Croaks on error.

Test::utf8 says:

(is_sane_utf8) This test fails if the string contains something that looks like it might be dodgy utf8, i.e. containing something that looks like the multi-byte sequence for a latin-1 character but perl hasn't been instructed to treat as such... This test fails when... The string contains utf8 byte sequences and the string hasn't been flagged as utf8 (this normally means that you got it from an external source like a C library;

Apparently it tests whether the string was properly decoded (I'm not familiar with it). encode_json returns octets, not characters, so I guess you need to run the result through Encode::decode_utf8 before feeding it to the second is_sane_utf8. (The Test::utf8 documentation has an example that uses Encode::_utf8_on.)
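A minimal sketch of what I mean, not a tested fix for the original script. It uses the core JSON::PP module rather than JSON::XS (both export encode_json), and the variable names are made up for illustration. The point is only that encode_json hands back bytes, where each guillemet takes two octets, and decode_utf8 turns those bytes back into characters:

```perl
use strict;
use warnings;
use utf8;                        # string literals in this file are UTF-8
use JSON::PP qw(encode_json);    # core module; stands in for JSON::XS here
use Encode   qw(decode_utf8);

my $text  = "«guillemets»";          # character string (12 characters)
my $json  = encode_json([$text]);    # byte string: UTF-8 encoded octets
my $chars = decode_utf8($json);      # decoded back into a character string

# Each guillemet is one character but two UTF-8 octets,
# so the byte string is two units longer than the decoded one.
printf "octets: %d, characters: %d\n", length($json), length($chars);
```

If that guess is right, the decoded $chars is what a character-level check such as is_sane_utf8 should be given, while the undecoded $json is what belongs on the wire or in a file.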

In reply to Re: JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255 by Anonymous Monk
in thread JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255 by Ovid

	PerlMonks