PerlMonks  

Perl should never treat a number as a 'wide character' without explicit notification from the programmer that that is his intent.

Judging by your example, I think you mean you don't want wide characters to be automatically encoded to UTF-8. (Correct me if I'm wrong.)

What do you propose instead? I can think of a few options.

  • Dying like syswrite? I'm not sure that's better, but I could easily be convinced.

  • Silently convert the numbers to UTF-8? I definitely want at least a warning if non-bytes are passed to print when warnings are on. I don't care what output it produces. Currently, it also warns when warnings are off. That's not appropriate, but I think that's supposed to change.

  • Silently truncate the high bits? Same reply as previous.
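
For reference, here is a minimal sketch of how print and syswrite currently differ when handed a character above 255 on a handle with no encoding layer. The temp file and variable names are only for illustration, and the exact message text may vary between Perl versions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $wide = chr(0x263A);   # one character with an ordinal above 255

my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh);             # plain byte handle, no encoding layer

# print: collect the warning instead of letting it go to STDERR.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    print {$fh} $wide;    # warns, then writes the UTF-8 encoding
}
close($fh);

# Read back the raw bytes that print actually wrote.
open(my $in, '<:raw', $path) or die "open: $!";
my $bytes = do { local $/; <$in> };
close($in);

# syswrite: the same string makes it die instead of warn.
my $syswrite_ok = eval {
    open(my $out, '>:raw', $path) or die "open: $!";
    syswrite($out, $wide);
    1;
};
my $syswrite_error = $@;

printf "print warned: %s", $warnings[0] // "(no warning)\n";
printf "bytes written by print: %vX\n", $bytes;
printf "syswrite %s\n", $syswrite_ok ? "succeeded" : "died: $syswrite_error";
```

So the first two options above are roughly what syswrite and print already do today.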

The term 'character' has no meaning outside of some mapping.

Characters have no meaning outside a mapping, but the term does. It's simply the basic unit of a string.

And even when it can be so mapped, until it is mapped, it is still just a number.

I fully agree. That's why I said pack doesn't deal with Unicode. It just deals with numbers. So do chr, ord, substr, index, etc.

Operators that do use mappings are lc, \d in regex patterns, etc.
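
A small sketch of that distinction, with arbitrary example values: the first group of operators only sees a sequence of numbers, while lc and \d consult Unicode tables for the same kind of string.

```perl
use strict;
use warnings;

# A three-element string built from plain numbers.
my $s = join '', map { chr } 100, 1000, 100_000;

# These operators treat the string as a sequence of numbers.
my $len   = length($s);             # number of elements
my $third = ord(substr($s, 2, 1));  # the number stored in the third element
my $pos   = index($s, chr(1000));   # position of the element 1000

# These operators apply a mapping (Unicode) to the numbers.
my $lower    = lc("\x{0394}");                   # GREEK CAPITAL LETTER DELTA
my $is_digit = "\x{0663}" =~ /^\d$/ ? 1 : 0;     # ARABIC-INDIC DIGIT THREE

printf "length=%d third=%d index=%d\n", $len, $third, $pos;
printf "lc gives U+%04X, \\d matched: %d\n", ord($lower), $is_digit;
```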

And 4294967296, much less 18446744073709551616 cannot be mapped to 'a character' in any known or proposed mapping.

No, but 4294967295 is a valid character.

>perl -E"say ord chr 4294967295"
4294967295

Perl uses utf8 (not to be confused with UTF-8), an encoding whose charset consists of 2**72 characters. Only up to UVMAX is supported, though.
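
To illustrate the utf8 vs UTF-8 distinction, here is a rough sketch using the Encode module. The ordinal 0x110000 is just an example of a number outside the Unicode range, and the comments describe what I would expect rather than a guarantee for every Encode version.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $beyond = chr(0x110000);   # first ordinal past the Unicode range

# 'utf8' is Perl's permissive encoding: it encodes the number as-is.
my $lax = encode('utf8', $beyond);

# 'UTF-8' is the strict, standard encoding: it should refuse the
# number and substitute a replacement character instead.
my $strict = encode('UTF-8', $beyond);

# The permissive bytes decode back to the original number.
my $round_trip = $lax;
utf8::decode($round_trip);

printf "utf8  : %vX\n", $lax;
printf "UTF-8 : %vX\n", $strict;
printf "decoded back to: %d\n", ord($round_trip);
```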

Unicode support in Perl is broken.

I'm not going to discuss this because this thread has nothing to do with Unicode.

The OP tried to send non-bytes to a file handle, and you tried to store something larger than a byte in a byte. A warning in the first case and dying in the second aren't unwarranted.


In reply to Re^6: Simplest Possible Way To Disable Unicode by ikegami
in thread Simplest Possible Way To Disable Unicode by JapanIsShinto
