http://qs321.pair.com?node_id=1195199


in reply to Re: Encoding horridness
in thread Encoding horridness

No, I haven't, but based on your suggestion and on others' comments (thanks to all), I will now do that. Incidentally, to answer questions from others: some of the data in the file comes from a MySQL database, and having checked, some of the fields are in UTF and some are latin1, so maybe that is the problem (although I believe you are right: my service provider should give more feedback, and I am going to badger them to do this). Other values just come from the script itself. I read that PERL internally uses UTF-8 format. So doesn't that mean that all data values, unless sourced directly from the database, are UTF-8, and therefore my latin1-encoded XML should never have worked? Or is it just that I was probably lucky, as latin1 is 'almost' a subset of UTF-8?

Re^3: Encoding horridness
by hippo (Bishop) on Jul 16, 2017 at 09:56 UTC
    I read that PERL internally uses UTF-8 format.

    Where did you read that? Certainly not from perlunitut, which says (my emphasis):

    Perl has an internal format, an encoding that it uses to encode text strings so it can store them in memory. All text strings are in this internal format. In fact, text strings are never in any other format!

    You shouldn't worry about what this format is, because conversion is automatically done when you decode or encode.
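
    To make the decode/encode point concrete, here is a minimal sketch using the core Encode module. The sample word and the variable names are made up for illustration; they stand in for one value from a latin1 column and one from a UTF-8 column, and are not taken from the original script. It assumes you decode each source with its own encoding on the way in and encode exactly once on the way out.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical inputs: the word "cafe" with an e-acute, as raw bytes.
my $latin1_bytes = "caf\xE9";       # as a latin1 column would deliver it
my $utf8_bytes   = "caf\xC3\xA9";   # as a UTF-8 column would deliver it

# Decode each source with its own encoding at the input boundary.
my $from_latin1 = decode('ISO-8859-1', $latin1_bytes);
my $from_utf8   = decode('UTF-8',      $utf8_bytes);

# After decoding, both are the same four-character text string.
my $same_text = $from_latin1 eq $from_utf8 ? 1 : 0;
my $chars     = length $from_latin1;

# Encode once on output, matching whatever the XML declaration claims.
my $out_latin1 = encode('ISO-8859-1', $from_latin1);
my $out_utf8   = encode('UTF-8',      $from_latin1);

# Pure ASCII text is byte-identical in both encodings.
my $ascii_same = encode('ISO-8859-1', 'cafe') eq encode('UTF-8', 'cafe') ? 1 : 0;

# Helper for display only: render a byte string as hex pairs.
sub hex_bytes { return join ' ', map { sprintf('%02X', ord($_)) } split //, $_[0] }

printf "same text: %d, chars: %d\n", $same_text, $chars;
printf "latin1 out: %s\n", hex_bytes($out_latin1);   # 63 61 66 E9
printf "utf8 out:   %s\n", hex_bytes($out_utf8);     # 63 61 66 C3 A9
printf "ascii same: %d\n", $ascii_same;
```

    The last check hints at why mixed data can appear to work for a while: only the ASCII range is shared byte-for-byte, so the difference shows up only once a character above 0x7F reaches the output.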