http://qs321.pair.com?node_id=906408


in reply to Simplest Possible Way To Disable Unicode

I agree. I also want use bytes; to disable those dumb warnings when I use chr and pack 'C*' on numbers greater than 255.

If I do pack 'C*', 257;, I am explicitly stating that I am packing 8-bit byte data, not f****** "wide characters", and if the numeric value needs more than 8 bits, it should be silently truncated.

If I want to pack wide characters, I can use the U template. I don't need or want those two data types conflated.
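For what it's worth, the truncation asked for here can also be spelled out by hand, which leaves the pack warnings category switched on. A minimal sketch; the helper name pack_octets is made up for illustration and is not part of Perl:

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl builtin): keep only the low 8 bits of
# each value before handing it to the C template, so nothing overflows.
sub pack_octets {
    return pack 'C*', map { $_ & 0xFF } @_;
}

my $bytes = pack_octets(65, 257, 511);
print join(' ', unpack 'C*', $bytes), "\n";   # prints "65 1 255"
```

The mask makes the intent visible at the call site, at the cost of one extra map per call.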

More to the point, I think unicode should be explicitly enabled by those that need it, not have to be disabled by those that don't.

If it were possible for unicode to be used transparently, then it might make some sense to enable it by default, but since it cannot, it doesn't.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Simplest Possible Way To Disable Unicode
by ikegami (Patriarch) on May 24, 2011 at 05:09 UTC

    it should be silently truncated.

    no warnings qw( pack );

    More to the point, I think unicode should be explicitly enabled by those that need it

    You're getting an overflow warning. It has nothing to do with Unicode. In fact, pack and unpack don't use Unicode at all.*

    * — Not even "U" has any understanding of Unicode.

    >perl -wE"say sprintf '%X', unpack 'U', pack 'U', 0x200000"
    200000
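The warnings pragma is lexical, so the suppression can be confined to the one block that wraps on purpose. A sketch, assuming the only warning you want gone is the one from this particular call:

```perl
use strict;
use warnings;

my $packed;
{
    # Only this block loses the 'pack' category; the rest of the file
    # still warns as usual.
    no warnings 'pack';
    $packed = pack 'C*', 257;
}
print ord($packed), "\n";   # prints 1 (257 & 255)
```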
      no warnings qw( pack );

      So, you'd have us throw away all the useful warnings that pack can emit when I do something wrong in order to disable the stupid warning emitted when it does something wrong. Cool-io. Not.

      You're getting an overflow warning.

      Oh sure. "Wide character" says 'overflow', like super-injunction says right to privacy for all.

      It has nothing to do with Unicode.

      Really? Can you guess where this direct quote " A Unicode character number." comes from?

      I don't give a flying fig whether you want to conflate the term 'unicode' with that multiplicitous cock-up of formats that hide behind the moniker 'The Unicode Standard'(*), and can't see that I used the former as a short-hand for 'multi-byte character sets'.

      Which should of course be 'The Multicode Standards:Everything including the (7 different) kitchen sinks'

      * Not even "U" has any understanding of Unicode.
      >perl -wE"say sprintf '%X', unpack 'U', pack 'U', 0x200000"
      200000

      Wadday'know. If you pack with U and unpack with U you get back what you packed. D'uh. A pointless example of nothing much.

      This is the problem.

      perl -wE"$s=pack 'U*', 257; say length $s; print for unpack 'C*', $s;"
      1
      257

      That totally devalues the purpose of having two different template characters.

      • one for C   An unsigned char (octet) value.
      • one for U   A Unicode character number.  Encodes to a character in character mode and UTF-8 ... in byte mode.

      That should not happen. And I shouldn't have to state that I don't want it to happen:

      >perl -Mbytes -wE"$s=pack 'U*', 257; say length $s; say for unpack 'C*', $s;"
      2
      196
      129
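For comparison, the same octets can be had without the bytes pragma by encoding explicitly before unpacking. A sketch, assuming a perl where pack 'U*' returns a character string as in the first one-liner above; utf8::encode is a core function that converts the string in place:

```perl
use strict;
use warnings;

my $s = pack 'U*', 257;       # one character, code point 257
my $chars = length $s;        # counted in characters

utf8::encode($s);             # in place: now the UTF-8 octets of that character
my @octets = unpack 'C*', $s;

print "$chars\n";             # 1
print "@octets\n";            # 196 129
```

This does not answer the complaint that it has to be asked for; it only shows where the byte view comes from.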

      It breaks backward compatibility in the very worst way.

      • Screaming when you are doing nothing wrong.

        Breaking both existing, working code and existing expectations. And causing people to disable important and useful warnings to silence it.

      • And saying nothing at all when it does the wrong thing.

        Just silently breaking previously working, 'best practice' code violating every expectation and rule of change and enhancement.

      The Unicode Standard is a cock-up. And the Perl implementation worse.



        So, you'd have us throw away all the useful warnings

        I'm not sure what other warnings pack 'C' or pack in general can emit. You could submit a patch so that pack overflow warnings are a subclass of pack warnings.

        Oh sure. "Wide character" says 'overflow', like super-injunction says right to privacy for all.

        It doesn't say "Wide character".

        >perl -we"$_ = pack 'C*', 257"
        Character in 'C' format wrapped in pack at -e line 1.

        It's saying how it handled an overflow.
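The wording is easy to check from a script by trapping the warning instead of letting it go to STDERR. A sketch; the text around the matched part may differ between Perl versions:

```perl
use strict;
use warnings;

my @caught;
my $packed;
{
    # Collect warnings in an array instead of printing them.
    local $SIG{__WARN__} = sub { push @caught, @_ };
    $packed = pack 'C*', 257;
}

print $caught[0];                              # the trapped message
print "value stored: ", ord($packed), "\n";    # what pack kept after wrapping
```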

        Really? Can you guess where this direct quote " A Unicode character number." comes from?

        That's easy, but moot. I've already pointed out the documentation is wrong. There is no such thing as Unicode number 0x200000, yet

        >perl -wE"say sprintf '%X', unpack 'U', pack 'U', 0x200000"
        200000

        The docs sometimes assign Unicode semantics to operations where no such semantics exist. "A Unicode character number." should simply be "A character number." In Perl, a character is a number in 0 to UVMAX.
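The same point can be checked with chr and ord alone, no pack involved. A sketch, assuming a perl whose integers are wide enough to hold 0x200000 (any ordinary build):

```perl
use strict;
use warnings;

my $cp = 0x200000;      # above U+10FFFF, so not a Unicode code point
my $s  = chr $cp;       # still a perfectly good one-character string

printf "%d %X\n", length($s), ord($s);   # prints "1 200000"
```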

Re^2: Simplest Possible Way To Disable Unicode
by Anonymous Monk on May 24, 2011 at 04:48 UTC
    Also, the open pragma doesn't disable the warnings either
    use open qw' :std IO :bytes ';
    use open qw' :std IO :raw ';
Re^2: Simplest Possible Way To Disable Unicode
by tchrist (Pilgrim) on May 24, 2011 at 20:15 UTC
    I entirely agree that letting cavalier coding errors slip silently by is a Very Bad Thing.

    The pack function is one of those many built-in functions that is much improved by being wrapped with a fatalizing envelope. Something as simple as this should suffice:

    *CORE::GLOBAL::pack = sub ($@) {
        use warnings FATAL => "pack";
        return CORE::pack(shift(), @_);
    };
    That will catch a lot of bugs that risk being carelessly ignored.
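Here is one way the wrapper might be wired into a script, as a sketch rather than the definitive form. The BEGIN block is an assumption on my part: CORE::GLOBAL overrides only affect code compiled after they are installed, so the assignment has to run at compile time to cover the rest of the file.

```perl
use strict;
use warnings;

# Install the override at compile time so that later pack calls in this
# file are compiled against it.
BEGIN {
    *CORE::GLOBAL::pack = sub ($@) {
        use warnings FATAL => 'pack';
        return CORE::pack(shift(), @_);
    };
}

my $fine = pack 'C*', 65, 66;                     # in range, no exception
my $ok   = eval { my $x = pack 'C*', 257; 1 };    # overflow warning is now fatal
my $err  = $@;

print "fine: $fine\n";                            # prints "fine: AB"
print $ok ? "no error\n" : "died: $err";
```

Wrapping the risky call in eval, as above, lets the caller decide what to do with the failure.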

    Hope this helps.

      Hope this helps.

      Have you heard of chocolate teapots?

