PerlMonks  

Re^4: treat files with umlauts (utf)

by hazylife (Monk)
on Apr 01, 2014 at 12:54 UTC ( [id://1080550] )


in reply to Re^3: treat files with umlauts (utf)
in thread treat files with umlauts (utf)

"The OP does not say that $scandir contains umlauts."
"Neither my code nor the code the OP posted requires the utf8 pragma."
True, but the OP did mention "a SLES Linux with UTF-8 support switched on", and, strictly speaking, use utf8 is not the only means of getting $scandir flagged as UTF-8:
binmode(STDIN, ':encoding(UTF-8)');
chomp(my $scandir = <STDIN>);
#OR
$ perl -CS -e 'chomp(my $scandir = <STDIN>); ...'
#OR
$ perl -CA -e 'my $scandir = shift; ...' <dir>
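
As a rough illustration of the first variant, the sketch below reads a line through an :encoding(UTF-8) layer and checks the result with utf8::is_utf8. It assumes an in-memory filehandle as a stand-in for STDIN, and the sample bytes are invented for the example. The script contains no use utf8 and its source is pure ASCII.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# No "use utf8" anywhere in this script.

# Raw UTF-8 bytes for "für" plus a newline, written with escapes so the
# source stays ASCII. Stand-in for what would arrive on STDIN (assumption).
my $bytes = "f\xC3\xBCr\n";

# Inspect the undecoded bytes first.
my $bytes_len     = length $bytes;                  # counts bytes, incl. newline
my $bytes_flagged = utf8::is_utf8($bytes) ? 1 : 0;

# An in-memory handle with the same layer that binmode() would put on STDIN.
open my $fh, '<:encoding(UTF-8)', \$bytes
    or die "Cannot open in-memory handle: $!";
chomp(my $scandir = <$fh>);
close $fh;

my $scandir_len     = length $scandir;              # counts decoded characters
my $scandir_flagged = utf8::is_utf8($scandir) ? 1 : 0;

print "raw bytes: length=$bytes_len flagged=$bytes_flagged\n";
print "decoded:   length=$scandir_len flagged=$scandir_flagged\n";
```

The decoded string comes out flagged because of the I/O layer, not because of anything in the program text.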

Re^5: treat files with umlauts (utf)
by kcott (Archbishop) on Apr 01, 2014 at 13:25 UTC

    I really don't think you understand what the utf8 pragma does. Here's another quote from the documentation (first line of the description):

    "The use utf8 pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope ..."

    It has nothing to do with the:

    • flagging variables ("getting $scandir flagged as UTF-8")
    • encoding of STDIN, STDOUT or STDERR ("perl -CS ...")
    • encoding of @ARGV elements ("perl -CA ...")

    Let me reiterate the quote I provided in my earlier post from the utf8 documentation:

    "Do not use this pragma for anything else than telling Perl that your script is written in UTF-8."

    It has nothing to do with data read into the script, data processed by the script, data generated by the script or data output by the script. It's only about the text used to write the script and how Perl should parse that source text.
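
A minimal sketch of that "source text only" effect, assuming the file is saved as UTF-8 (so the umlaut occupies two bytes on disk). The variable names are invented for the example. The same literal is measured once inside a block with the pragma and once outside it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.

my $len_with_pragma;
{
    use utf8;    # lexically scoped: affects only literals inside this block
    $len_with_pragma = length 'für';
}

# Outside the block the pragma is off, so the literal is parsed as raw bytes.
my $len_without_pragma = length 'für';

print "with use utf8:    $len_with_pragma\n";
print "without use utf8: $len_without_pragma\n";
```

Under that assumption the first length counts characters and the second counts bytes, which is the parsing difference the documentation describes.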

    -- Ken

      "It has nothing to do with the: encoding of STDIN ... encoding of @ARGV elements"
      Correct.
      "It's only about the text used to write the script and how Perl should parse that source text."
      Yes, so...
      use utf8;
      my $scandir = 'something with umlauts in it';
      #              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      # ...is this string literal not part of the source?
      "It has nothing to do with flagging variables"
      #!/usr/bin/perl
      
      use strict;
      use Devel::Peek;
      
      {
          use utf8;
          my $var = 'für';
          print Dump \$var;
      }
      
      my $var = 'für';
      print Dump \$var;
      
        use utf8;
        my $scandir = 'something with umlauts in it';

        That's exactly the same code you invented, five nodes back, in your original post in this thread: "Re^2: treat files with umlauts (utf)". It is not code the OP posted (or even described in his narrative). My response is unchanged.

        # ...is this string literal not part of the source?

        That string literal is only part of the source you've invented.

        ... use Devel::Peek; ...

        Posting code without explaining why you're doing so is not particularly helpful.

        If you're referring to the part of its output containing:

        FLAGS = (PADMY,POK,pPOK,UTF8)

        Then the UTF8 part of that is caused by the umlaut in 'für'. But, the OP's posted code contains no umlauts. Only your invented code contains umlauts.

        Change 'für' to 'fur', and you'll get:

        FLAGS = (PADMY,POK,pPOK)

        Just like the OP's posted code, this does not contain any umlauts and there's no UTF8 in the output.
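
For readers without Devel::Peek to hand, the flag can also be checked with utf8::is_utf8. The sketch below is only an illustration, not code from the thread. It assumes the file is saved as UTF-8, and the variable names and labels are invented. It compares an ASCII literal and an umlaut literal under use utf8 with an umlaut literal outside the pragma's scope.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.

my ($ascii_with_pragma, $umlaut_with_pragma);
{
    use utf8;
    $ascii_with_pragma  = 'fur';    # ASCII only
    $umlaut_with_pragma = 'für';    # non-ASCII, parsed as characters
}
my $umlaut_without_pragma = 'für';  # non-ASCII, parsed as raw bytes

# Labels are kept ASCII so nothing non-ASCII is printed.
my @cases = (
    [ 'fur  + use utf8' => $ascii_with_pragma     ],
    [ 'f-umlaut-r + use utf8' => $umlaut_with_pragma ],
    [ 'f-umlaut-r, no pragma' => $umlaut_without_pragma ],
);

for my $case (@cases) {
    my ($label, $value) = @$case;
    printf "%-24s flagged=%s\n", $label, utf8::is_utf8($value) ? 'yes' : 'no';
}
```

On a UTF-8 source file, only the umlaut literal inside the use utf8 block should report the flag.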

        You can keep inventing code that requires use utf8 all you want, but the OP's posted code contains no umlauts (or any other characters) that require use utf8.

        Please be very clear on these points:

        • The OP's posted code does not contain umlauts.
        • The OP's posted code does not include an assignment to $scandir.
        • The OP's posted code does not require use utf8;.

        -- Ken
