PerlMonks  

Why does Perl choke on my UTF8 identifier?

by Intrepid (Deacon)
on Mar 29, 2021 at 23:11 UTC ( [id://11130561]=perlquestion )

Intrepid has asked for the wisdom of the Perl Monks concerning the following question:

This must be answered somewhere but SuperSearch didn't find it. I'm using use utf8 in my test program and the program won't compile; the error message is Malformed UTF-8 character: \xf0\x69\x20\x20 (unexpected non-continuation byte 0x69, immediately after start byte 0xf0; need 4 bytes, got 1) at wordy.pl line 6. Here's my simple program:
    #!perl
    use warnings;
    use strict;
    use utf8;

    my $veiði = "exciting";
    my $efnisskrá = "repertoir";
    use open IO => ':utf8';
    print "in my $efnisskrá (efnisskrá) I like to go fishing ($veiði)\n";

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: Why does Perl choke on my UTF8 identifier?
by choroba (Cardinal) on Mar 30, 2021 at 05:51 UTC
    Are you sure you saved your file as utf8? When I copy the source and save it, it doesn't contain the byte \xF0 anywhere. The sequence f069 2020 is present, however, when I convert the file to iso-8859-1:
    iconv -f utf8 -t l1 1.pl | xxd | grep f069.2020
    00000030: 6638 3b0a 0a6d 7920 2476 6569 f069 2020  f8;..my $vei.i
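
The byte difference choroba describes can also be reproduced from within Perl. This is only a minimal sketch, assuming the core Encode module; the helper name hex_bytes is made up for illustration, and ð is written as the escape \x{f0} so the snippet does not depend on how it is saved:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode a character string and return its bytes as hex.
sub hex_bytes {
    my ($encoding, $chars) = @_;
    return join ' ', map { sprintf '%02x', ord } split //, encode($encoding, $chars);
}

my $name = "vei\x{f0}i";    # "veiði", with U+00F0 written as an escape

print "UTF-8:      ", hex_bytes('UTF-8', $name), "\n";
print "ISO-8859-1: ", hex_bytes('ISO-8859-1', $name), "\n";
```

In the ISO-8859-1 form, ð is the single byte f0. Read as UTF-8, f0 is the start byte of a four-byte sequence, and the 69 ("i") that follows is not a continuation byte, which is what the error message in the question reports.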
Re: Why does Perl choke on my UTF8 identifier?
by kcott (Archbishop) on Mar 30, 2021 at 05:33 UTC

    See the open pragma. From the synopsis:

    # with :std, also affect global standard handles

    Here's what I get when I add :std:

    perl -e '
        use warnings;
        use strict;
        use utf8;
        my $veiði = "exciting";
        my $efnisskrá = "repertoir";
        use open IO => qw{:utf8 :std};
        print "in my $efnisskrá (efnisskrá) I like to go fishing ($veiði)\n";
    '
    in my repertoir (efnisskrá) I like to go fishing (exciting)
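
What an output layer changes can be seen without touching STDOUT at all. The following is a rough sketch, not kcott's code: it assumes in-memory filehandles and an explicit :encoding(UTF-8) layer in place of the open pragma, and the helper name written_bytes is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle opened with
# the given mode (including any layer) and return the bytes that arrive.
sub written_bytes {
    my ($mode, $text) = @_;
    my $buffer = '';
    open my $fh, $mode, \$buffer or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "cannot close in-memory handle: $!";
    return $buffer;
}

my $word = "efnisskr\x{e1}";    # "efnisskrá", with U+00E1 written as an escape

my $plain   = written_bytes('>', $word);                  # no encoding layer
my $encoded = written_bytes('>:encoding(UTF-8)', $word);  # UTF-8 layer

printf "no layer:    %d bytes\n", length $plain;
printf "UTF-8 layer: %d bytes\n", length $encoded;
```

Without a layer the á goes out as the single byte e1; with the layer it goes out as the two bytes c3 a1. Per the synopsis line quoted above, :std is what extends such a layer to the global standard handles.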

Re: Why does Perl choke on my UTF8 identifier?
by Anonymous Monk on Mar 30, 2021 at 13:33 UTC

    It is not enough just to say use utf8;. Your source file must actually be in utf-8 rather than some other encoding. When I download your sample and try to run it, I get the same compile errors you do. But when I edit it with vim and issue the vim :set fileencoding? command, it says fileencoding=latin1. When I :set fileencoding=utf-8 and save the file, the compile errors go away.

    As kcott pointed out, you will also need to correct your use open IO ... to get the output you want.
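
For anyone without vim at hand, the same check can be approximated in Perl. This is a minimal sketch under assumptions: it uses the core Encode module with strict decoding, and the helper names is_valid_utf8 and file_is_valid_utf8 are hypothetical:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if $bytes decodes cleanly as strict UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its input
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Hypothetical helper: read a file as raw bytes and run the check on it.
sub file_is_valid_utf8 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    local $/;             # slurp mode: read the whole file at once
    my $bytes = <$fh>;
    close $fh;
    return is_valid_utf8($bytes);
}

print is_valid_utf8("my \$vei\xc3\xb0i") ? "valid\n" : "not valid\n";   # UTF-8 bytes
print is_valid_utf8("my \$vei\xf0i  ")   ? "valid\n" : "not valid\n";   # Latin-1 bytes
```

Calling file_is_valid_utf8('wordy.pl') on the file named in the error message would report whether the editor really saved it as UTF-8. A pure-ASCII file also passes, so the check only says that the bytes are consistent with UTF-8.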


      Aha! blaming the editor is correct. Your advice (Thank You!) was spot-on. I use vim too, and indeed, the fileencoding was set to iso-8859. I corrected that and ran the program successfully. I only feel a little under-educated. Effective Perl Programming should have had a note about editors and their codepages and how use utf8 won't dwim if the editor is saving the program text as something else ;-).

      Examine what is said, not who speaks.
      Love the truth but pardon error.
      Silence betokens consent.
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Wishful thinking on my part... aren't we long overdue to have editors default to UTF-8?

        use OldMan::Yelling::AtClouds;

        -bib

Re: Why does Perl choke on my UTF8 identifier?
by Anonymous Monk on Mar 30, 2021 at 07:52 UTC

    my simple program

    Sorry, that's not it.

    if you want to show the bytes of your program post the output of

    use Data::Dump qw/ dd /;
    use Path::Tiny qw/ path /;
    dd( path('intrepid.pl')->slurp_raw );
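
Data::Dump and Path::Tiny both come from CPAN. If they are not installed, a core-only dump of the raw bytes might look like the sketch below; the helper name hex_lines and the 16-bytes-per-line layout are assumptions, not anything from the modules above:

```perl
use strict;
use warnings;

# Hypothetical helper: render raw bytes as lines of up to 16 hex pairs,
# each prefixed with the offset of its first byte.
sub hex_lines {
    my ($bytes) = @_;
    my @lines;
    for (my $offset = 0; $offset < length($bytes); $offset += 16) {
        my $chunk = substr($bytes, $offset, 16);
        push @lines, sprintf '%08x: %s', $offset,
            join ' ', map { sprintf '%02x', ord } split //, $chunk;
    }
    return @lines;
}

# Read the file named on the command line as raw bytes and dump it.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "cannot open $ARGV[0]: $!";
    local $/;    # slurp mode: read the whole file at once
    my $bytes = <$fh>;
    print "$_\n" for hex_lines($bytes);
}
```

Saved under any name and run with the program file as its only argument, it prints the bytes exactly as they are on disk, with no decoding applied.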

Front-paged by davies