Perl should never treat a number as a 'wide character' without explicit notification from the programmer that that is his intent.
Judging by your example, I think you mean you don't want wide characters to automatically get encoded to UTF-8. (Correct me if I'm wrong.)
What do you propose instead? I can think of a couple.
- Dying like syswrite? I'm not sure that's better, but I could easily be convinced.
- Silently convert the numbers to UTF-8? I definitely want at least a warning if non-bytes are passed to print when warnings are on. I don't care what output it produces. Currently, it also warns when warnings are off. That's not appropriate, but I think that's supposed to change.
- Silently truncate the high bits? Same reply as previous.
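For reference, here is a minimal sketch of how print and syswrite treat a non-byte on a handle with no encoding layer. The temp file and the variable names are mine, purely for illustration, and the exact warning text may differ between Perl versions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $wide = chr(0x263A);              # a single element with ordinal 9786: not a byte

my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;                         # raw handle, no encoding layer

# Capture warnings instead of letting them go to STDERR.
my @warned;
{
    local $SIG{__WARN__} = sub { push @warned, $_[0] };
    print {$fh} $wide;               # warns, then writes the UTF-8 encoding
}

# syswrite refuses the same string outright.
my $ok  = eval { syswrite($fh, $wide); 1 };
my $err = $ok ? '' : $@;
close $fh;

# Read back the raw bytes that print actually wrote.
open(my $in, '<:raw', $path) or die "open: $!";
my $bytes = do { local $/; <$in> };
close $in;

print "print warned:  ", (@warned ? $warned[0] : "(nothing)\n");
print "syswrite died: ", ($err ne '' ? $err : "(no)\n");
print "bytes written: ", join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";
```

On the perls I know of, print warns and writes E2 98 BA, while syswrite dies before writing anything.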
The term 'character' has no meaning outside of some mapping.
Characters have no meaning outside a mapping, but the term does. It's simply the basic unit of a string.
And even when it can be so mapped, until it is mapped, it is still just a number.
I fully agree. That's why I said pack doesn't deal with Unicode. It just deals with numbers. So do chr, ord, substr, index, etc.
Operators that do use mappings are lc, \d in regex patterns, etc.
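A small sketch of that distinction, with code points I picked arbitrarily for the example: chr, ord, length, substr and index only ever see numbers, while lc and \d have to consult a mapping (Unicode, in this case) to decide what a number means.

```perl
use strict;
use warnings;

# Build a string from three numbers. No mapping is involved yet.
my $s = join '', map { chr } 0x41, 0x410, 0x663;

printf "length: %d\n", length $s;                  # 3 elements
printf "ord:    %d\n", ord substr($s, 1, 1);       # 1040, just the number back
printf "index:  %d\n", index($s, chr 0x663);       # 2

# These operators interpret the numbers through a mapping.
my $lower = lc substr($s, 1, 1);                   # U+0410 maps to U+0430
printf "lc:     0x%X\n", ord $lower;

my $is_digit = chr(0x663) =~ /\d/ ? 1 : 0;         # U+0663 is a decimal digit
printf "\\d:     %d\n", $is_digit;
```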
And 4294967296, much less 18446744073709551616 cannot be mapped to 'a character' in any known or proposed mapping.
No, but 4294967295 is a valid character.
>perl -E"say ord chr 4294967295"
4294967295
Perl uses utf8 (not to be confused with UTF-8), an encoding whose charset consists of 2**72 characters. Only up to UVMAX is supported, though.
Unicode support in Perl is broken.
I'm not going to discuss this because this thread has nothing to do with Unicode.
The OP tried to send non-bytes to a file handle, and you tried to store something larger than a byte in a byte. A warning and dying aren't unwarranted.