http://qs321.pair.com?node_id=881497


in reply to Re^5: How to reverse a (Unicode) string
in thread How to reverse a (Unicode) string

No, actually, I'm not confused. When the term was introduced, it was given as the reason iso-8859-1 works without being decoded, so he indeed meant an identity mapping.

Re^7: How to reverse a (Unicode) string
by JavaFan (Canon) on Jan 10, 2011 at 16:29 UTC
    You have to always decode. Note that Unicode is a list of integers with a meaning. iso-8859-1 is an encoding (of a subset of Unicode). UTF-8 is also an encoding. UTF-16 is another. It just happens that for the first 128 code points, the encodings in iso-8859-1 and UTF-8 are identical. But that wasn't part of Juerd's claim.
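
The overlap in the first 128 code points can be checked mechanically. What follows is a minimal sketch in Python, used here only for illustration (the thread itself concerns Perl); the helper name `same_bytes_in_both` is hypothetical and not taken from any post.

```python
def same_bytes_in_both(code_point: int) -> bool:
    """True if the code point encodes to the same bytes in iso-8859-1 and UTF-8."""
    ch = chr(code_point)
    return ch.encode("iso-8859-1") == ch.encode("utf-8")


if __name__ == "__main__":
    # Code points 0..127: both encodings produce one identical byte.
    print(all(same_bytes_in_both(cp) for cp in range(128)))
    # Code points 128..255: iso-8859-1 uses one byte, UTF-8 uses two.
    print(any(same_bytes_in_both(cp) for cp in range(128, 256)))
    print("é".encode("iso-8859-1"), "é".encode("utf-8"))
```

Above code point 127 the two encodings diverge, so the overlap covers US-ASCII only, not the whole of iso-8859-1.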

      You have to always decode.

      No, you don't have to with US-ASCII and iso-8859-1.

      But that wasn't part of Juerd's claim.

      I agree. He didn't mention any relation between the first 128 characters of iso-8859-1 and UTF-8. No idea why you bring this up.

      iso-8859-1 is an encoding (of a subset of Unicode)

      Unicode is a character set, not an encoding, so that sentence is broken.

      iso-8859-1 is both a character set and an encoding. The iso-8859-1 character set is a subset of the Unicode character set, but this property does NOT explain why iso-8859-1 works without being decoded.
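
The "identity mapping" at issue can be illustrated as follows: decoding a byte as iso-8859-1 yields the code point with the same number as the byte, which is not true of UTF-8. This is a rough sketch in Python, chosen for illustration only (it says nothing about how Perl stores strings internally); both function names are hypothetical.

```python
def latin1_decode_is_identity() -> bool:
    """True if every byte value N decodes, as iso-8859-1, to code point N."""
    return all(
        ord(bytes([n]).decode("iso-8859-1")) == n
        for n in range(256)
    )


def utf8_decode_is_identity() -> bool:
    """Same check for UTF-8; a byte that is invalid on its own counts as a mismatch."""
    for n in range(256):
        try:
            if ord(bytes([n]).decode("utf-8")) != n:
                return False
        except UnicodeDecodeError:
            # Bytes 0x80..0xFF are not valid single-byte UTF-8 sequences.
            return False
    return True


if __name__ == "__main__":
    print(latin1_decode_is_identity())
    print(utf8_decode_is_identity())
```

Under this check iso-8859-1 is an identity mapping over all 256 byte values, while UTF-8 is not: it stops matching at byte 0x80.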

        No, you don't have to with US-ASCII and iso-8859-1.

        Sure you do. It's not a difficult encoding, but it still is an encoding.

        Unicode is a character set, not an encoding, so that sentence is broken.

        Wait. You are saying that a sentence of the form "X is an encoding of Y" is broken in English if Y isn't an encoding?

        I guess that "UTF-8 is an encoding of Unicode" is equally broken. For the reason that Unicode isn't an encoding in that sentence either.