http://qs321.pair.com?node_id=11112594


in reply to Re^3: Lost in encodings
in thread Lost in encodings

For completeness–

perl -Mutf8 -CSD -E 'say length "Kü"' # 2
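A minimal sketch of why that prints 2, assuming the source is saved as UTF-8: with -Mutf8 the literal is a character string, so length counts characters. The same text held as undecoded UTF-8 octets counts as 3. The variable names here are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode);

# "Kü" as UTF-8 octets: K is 0x4B, ü is 0xC3 0xBC.
my $octets = "K\xC3\xBC";

# Decoding turns the octets into a character string.
my $chars = decode('UTF-8', $octets);

print length($octets), "\n";   # 3 (counts octets)
print length($chars), "\n";    # 2 (counts characters)
```

If length reports 3 for data that should be "Kü", the string was most likely never decoded on the way in.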

Re^5: Lost in encodings
by LanX (Cardinal) on Feb 07, 2020 at 21:37 UTC
    Yes but only if it's a character-string, i.e. the utf8 flag is set.

    But the OP said the flag is not set.

    edit

    not sure what -CSD means.

    update

    got it, from perlrun:

    The -C flag controls some of the Perl Unicode features.

    As of 5.8.1, the -C can be followed either by a number or a list of option letters. The letters, their numeric values, and effects are as follows; listing the letters is equal to summing the numbers.

    I     1   STDIN is assumed to be in UTF-8
    O     2   STDOUT will be in UTF-8
    E     4   STDERR will be in UTF-8
    S     7   I + O + E
    i     8   UTF-8 is the default PerlIO layer for input streams
    o    16   UTF-8 is the default PerlIO layer for output streams
    D    24   i + o
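As a worked example of "listing the letters is equal to summing the numbers", here is a small sketch that maps a letter list to its number. The %c_value table holds only the letters quoted above, and c_number is a hypothetical helper, not part of Perl. It combines values with bitwise OR, which matches the sum whenever the letters do not overlap.

```perl
use strict;
use warnings;

# Letter values as quoted from perlrun above (other letters omitted).
my %c_value = (I => 1, O => 2, E => 4, S => 7, i => 8, o => 16, D => 24);

# Hypothetical helper: combine the values of each letter in the list.
sub c_number {
    my ($letters) = @_;
    my $n = 0;
    for my $letter (split //, $letters) {
        my $value = $c_value{$letter};
        die "unknown -C letter: $letter\n" unless defined $value;
        $n |= $value;   # OR rather than +, so "SI" does not count I twice
    }
    return $n;
}

print c_number('SD'), "\n";   # 31, i.e. 7 + 24
```

So -CSD asks for UTF-8 on STDIN, STDOUT and STDERR plus UTF-8 as the default layer for other input and output streams, and by the quoted rule it should be the same as writing -C31.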

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice

      Quite right. Those are the default I/O UTF-8 flags. tchrist included some of them in his recommendations on that monster SO post. I included the one-liner just as an example of what’s correct. If length is returning bytes instead of characters, it’s broken already, and that step, or one before it, is the problem. If the OP included an SSCCE, I’d be more helpful. Possibly… :P