in reply to Re^2: RT::Client turns occasional binary characters in to wide characters in thread RT::Client turns occasional binary characters in to wide characters
This is another reason why is_utf8 is a trap. It does not tell you that the string is "in UTF-8"; it is an internal flag that describes how Perl is storing the string. utf8::upgrade and utf8::downgrade turn this flag on and off respectively without changing the string as seen by Perl code, provided the string can be represented in your native encoding (otherwise utf8::downgrade will croak). So the only thing you can reliably determine from is_utf8 is that every Perl string containing code points above U+FF *must* have it enabled. The converse does not hold: a string with the flag enabled need not contain any such code point.
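A small sketch of the point, in case it helps. The variable names and sample strings are mine and purely illustrative; the only functions used are utf8::is_utf8, utf8::upgrade and utf8::downgrade, which are available without saying "use utf8".

```perl
use strict;
use warnings;

# "\xE9" (e-acute) fits in one byte, so this literal is stored
# without the internal UTF8 flag.
my $plain    = "caf\xE9";
my $upgraded = $plain;              # copy, then change only the storage
utf8::upgrade($upgraded);

my $flag_plain    = utf8::is_utf8($plain)    ? 1 : 0;   # 0
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;   # 1

# To Perl code the two strings are still the same string.
my $same_string = ($plain eq $upgraded)                 ? 1 : 0;   # 1
my $same_length = (length($plain) == length($upgraded)) ? 1 : 0;   # 1

# Downgrading switches the flag off again; the string is still unchanged.
utf8::downgrade($upgraded);
my $flag_after_downgrade = utf8::is_utf8($upgraded) ? 1 : 0;       # 0

# A code point above U+FF forces the flag on, and downgrade croaks.
my $wide         = "\x{263A}";
my $flag_wide    = utf8::is_utf8($wide) ? 1 : 0;                    # 1
my $downgrade_ok = eval { utf8::downgrade($wide); 1 } ? 1 : 0;      # 0

# With a true second argument, downgrade returns false instead of dying.
my $soft_result = utf8::downgrade($wide, 1) ? 1 : 0;                # 0

printf "plain=%d upgraded=%d same=%d after_downgrade=%d wide=%d\n",
    $flag_plain, $flag_upgraded, $same_string,
    $flag_after_downgrade, $flag_wide;
```

So the flag differs between $plain and $upgraded even though eq and length cannot tell them apart, which is why testing is_utf8 to decide whether data "is UTF-8" goes wrong.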