PerlMonks  

It doesn't say "Wide character".

Specific error message aside, Perl should never treat a number as a 'wide character' without explicit notification from the programmer that that is his intent.

    c:\test>perl -we"print chr( 257 )" | wc -c
    Wide character in print at -e line 1.
    2
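For anyone who wants to reproduce that without a shell pipeline, here is a minimal sketch, assuming a perl that ships the core Encode module. The variable names are mine and purely illustrative. It counts the same string once as characters and once as UTF-8 bytes, then prints it through a handle whose encoding has been declared explicitly:

```perl
use strict;
use warnings;
use Encode qw(encode);

# chr(257) is a single Perl character whose ordinal is 257 (U+0101).
my $char = chr(257);

# Counted as characters, the string has length 1.
my $char_count = length $char;

# Encoded explicitly as UTF-8, the same character occupies two bytes
# (0xC4 0x81), which matches the "2" that wc -c reports above.
my $bytes      = encode('UTF-8', $char);
my $byte_count = length $bytes;

printf "characters: %d, bytes: %d\n", $char_count, $byte_count;

# Declaring the encoding on the handle states the intent explicitly,
# so this print does not emit the "Wide character" warning.
binmode STDOUT, ':encoding(UTF-8)';
print $char, "\n";
```

The only difference between the warning case and the quiet case is whether the programmer told the handle what encoding to use.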
I've already pointed out the documentation is wrong.

No! You didn't. Nowhere prior to this post anywhere in this thread.

There is no such thing as Unicode number 0x20000, yet

So, the documentation is wrong! And the implementation is (silently) wrong!

That pretty much covers everything. Unicode support in perl is broken.

In Perl, a character is a number in 0 to UVMAX.

And that bullshit is exactly why it is so broken.

Because &^*&% like you will keep on conflating 'numbers' with 'characters'.

  1. UVMAX is CPU dependent.

    Typically 4294967295 or 18446744073709551615, but with other values possible.

  2. The term 'character' has no meaning outside of some mapping.

    Unless a number can be mapped to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language, it is just a number.

    And even when it can be so mapped, until it is mapped, it is still just a number.

    And any suggestion otherwise is just so much bullshit.

  3. And 4294967295, much less 18446744073709551615, cannot be mapped to 'a character' in any known or proposed mapping.

    Which makes this:

    In Perl [or any language], a character is a number in 0 to UVMAX.
    stand out as the total twaddle it is.
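To put numbers on that gap, a minimal sketch, assuming a reasonably modern perl. The variable names are mine; `~0` is used here as the usual idiom for the largest UV on the build, and U+10FFFF is the top of the Unicode code space:

```perl
use strict;
use warnings;

# ~0 is the all-ones unsigned integer, i.e. the largest UV on this build.
my $uv_max = ~0;

# The Unicode code space ends at U+10FFFF.
my $unicode_max = 0x10FFFF;

printf "UV max:      %u\n", $uv_max;
printf "Unicode max: %u\n", $unicode_max;

# chr accepts an ordinal beyond the Unicode range and ord hands it
# back unchanged, although nothing in Unicode is assigned there.
my $beyond;
{
    # Silence any warnings about non-Unicode ordinals that some
    # perl versions may emit.
    no warnings;
    $beyond = chr($unicode_max + 1);
}
printf "ord(chr(0x110000)) = 0x%X, length %d\n", ord($beyond), length $beyond;
```

Everything between the two maxima is a number that perl will happily store in a string, with no mapping behind it.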

Unicode support in Perl is broken. And until people like you stop pretending that it isn't, it will stay that way.

Indeed, the longer people keep pretending that you can transparently handle the abortion that is Unicode, whether retrofitting an existing language or implementing a new one, the longer it will be before we can evolve some sane semantics for handling MBCSs.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^5: Simplest Possible Way To Disable Unicode by BrowserUk
in thread Simplest Possible Way To Disable Unicode by JapanIsShinto
