in reply to Re^4: Simplest Possible Way To Disable Unicode
in thread Simplest Possible Way To Disable Unicode
It doesn't say "Wide character".
Specific error message aside, Perl should never treat a number as a 'wide character' without explicit notification from the programmer that that is his intent.
c:\test>perl -we"print chr( 257 )" | wc -c
Wide character in print at -e line 1.
2
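For anyone wondering where the 2 in that output comes from, here is a minimal sketch. It assumes the core Encode module and uses my own variable names; it is an illustration of the byte count, not a claim about what print does internally. The number 257 becomes one "character" the moment it goes through chr, and two bytes once it is written out as UTF-8:

```perl
use strict;
use warnings;
use Encode qw(encode);

# One "character" as far as Perl is concerned: code point 257 (0x101).
my $char = chr(257);

# Encode explicitly to UTF-8 instead of leaving it to print.
my $bytes = encode('UTF-8', $char);

printf "characters: %d\n", length $char;
printf "bytes:      %d\n", length $bytes;
```

With the explicit encode there is no warning, because the programmer has stated which mapping is intended.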
I've already pointed out the documentation is wrong.
No! You didn't. Nowhere prior to this post anywhere in this thread.
There is no such thing as Unicode number 0x20000, yet
So, the documentation is wrong! And the implementation is (silently) wrong!
That pretty much covers everything. Unicode support in perl is broken.
In Perl, a character is a number in 0 to UVMAX.
And that bullshit is exactly why it is so broken.
Because &^*&% like you will keep on conflating 'numbers' with 'characters'.
- UVMAX is CPU dependent.
Typically 4294967295 or 18446744073709551615, but with other values possible.
- The term 'character' has no meaning outside of some mapping.
Unless a number can be mapped to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language, it is just a number.
And even when it can be so mapped, until it is mapped, it is still just a number.
And any suggestion otherwise is just so much bullshit.
- And 4294967295, much less 18446744073709551615, cannot be mapped to 'a character' in any known or proposed mapping.
Which makes this:
In Perl [or any language], a character is a number in 0 to UVMAX.
stand out as the total twaddle it is.
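The gap is easy to check on any given build. A minimal sketch, assuming only core Perl: ~0 gives the largest unsigned integer the build supports (which I am taking to be UVMAX), $Config{uvsize} gives its width in bytes, and 0x10FFFF is the highest code point Unicode defines. The variable names are mine.

```perl
use strict;
use warnings;
use Config;

# All bits set: the largest unsigned integer (UV) on this build.
my $uv_max = ~0;

# The highest code point the Unicode code space defines.
my $unicode_max = 0x10FFFF;

printf "UV size:     %d bytes\n", $Config{uvsize};
printf "UVMAX:       %s\n", $uv_max;
printf "Unicode max: %d\n", $unicode_max;

print "UVMAX is far beyond anything Unicode can map\n"
    if $uv_max > $unicode_max;
```

Everything between 0x10FFFF and UVMAX is a number that no mapping turns into a character.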
Unicode support in Perl is broken. And until people like you stop pretending that it isn't, it will stay that way.
Indeed, the longer people keep pretending that you can transparently handle the abortion that is Unicode, whether retro-fitting an existing language or implementing a new one, the longer it will be before we can evolve some sane semantics for handling MBCSs.