http://qs321.pair.com?node_id=651153

sylvanus_monachus has asked for the wisdom of the Perl Monks concerning the following question:

Here is a strange Perl bug.

After eval-ing a variable with an accent, subsequent evals don't work. Has anyone seen this before, and whom do I need to alert about this Perl bug?

The bug

$string should be "new eval done", but it is undef.

The Code

my $code1='my $var_with_é_accent;';
print "Eval buggy code\n";
eval $code1;
print "ERR = $@" if ($@);;
print "DONE\n";

my $string;
my $code2='$string="new eval done";';
print "Eval good code\n";
eval $code2;
print "ERR = $@" if ($@);
print "string is : $string\n";
print "DONE\n";

Action

I have reported a Perl bug (47517).

Replies are listed 'Best First'.
Re: Bug : eval and accent
by ikegami (Patriarch) on Nov 16, 2007 at 10:06 UTC

    Replicated the problem (ActivePerl 5.8.8, WinXP) and it's definitely a bug.

    The code does get executed. Changing
    my $code2='$string="new eval done";';
    to
    my $code2='$string="new eval done"; print "!"';
    prints an exclamation point. Perl just doesn't seem to access $string properly.

    The bug doesn't manifest itself if you change
    my $code1='my $var_with_é_accent;';
    to
    my $code1='$var_with_é_accent;';

    By the way, that's an iso-latin-1 encoded é, "\xE9".
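To make the byte-level distinction concrete, here is a small sketch using the core Encode module. The helper name `hex_bytes` is made up for this illustration; it prints the bytes that represent é in ISO-8859-1 and in UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative helper (not from the thread): format a byte string as hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', $_) } unpack('C*', $bytes);
}

# U+00E9 (é) written as an escape so this file itself stays pure ASCII.
my $e_acute = "\x{E9}";

print "ISO-8859-1: ", hex_bytes(encode('ISO-8859-1', $e_acute)), "\n";  # E9
print "UTF-8:      ", hex_bytes(encode('UTF-8',      $e_acute)), "\n";  # C3 A9
```

The single byte E9 is what the "Unrecognized character \xE9" message refers to.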

Re: Bug : eval and accent
by angiehope (Pilgrim) on Nov 16, 2007 at 10:37 UTC
    Hi! Same results here (Perl 5.8.8 on Slackware linux 12.0).
    perl -w 651153.pl
    Eval buggy code
    ERR = Unrecognized character \xE9 at (eval 1) line 1.
    DONE
    Eval good code
    Use of uninitialized value in concatenation (.) or string at 651153.pl line 15.
    string is :
    DONE
    Since the file was encoded in ISO-8859, I converted it to utf-8, but with similar results. Then I added "use utf8;" as first line - and the script worked:
    Eval buggy code
    DONE
    Eval good code
    string is : new eval done
    DONE
    So, if you want to use non-ASCII characters in variable names, save the script as a utf8 text file and include "use utf8;". Have a nice day!

      The bug is not "accent in variable". The bug is that in the second eval the Perl lexer can't parse a normal variable anymore. Eval is useful for executing buggy code, and that's why I use it: the code comes from the user, who can always enter an "Unrecognized character".

      And you are right, the way to solve the "accent in variable" problem is this:

      use Unicode::String;

      my $code1='my $var_with_é_accent=5;print "var is $var_with_é_accent";';
      $code1="use utf8;$code1";
      $code1=Unicode::String::latin1($code1)->utf8;
      print "\nEval buggy code\n";
      eval "$code1";
      print "ERR = $@" if ($@);;
      print "DONE\n";

      my $string;
      my $code2='$string="new eval done";';
      print "\nEval good code\n";
      eval $code2;
      print "ERR = $@" if ($@);
      print "string is : $string\n";
      print "DONE\n";

      The bug exists with use utf8 too.

      use strict;
      use warnings;

      use Encode         qw( encode );
      use HTML::Entities qw( decode_entities );

      # This eval should have no effect on subsequent evals.
      eval encode('utf-8', decode_entities(
          'use utf8; my $var_with_♥_non_word;'
      ));

      my $string = 'abc';
      eval '$string = "def"; 1' or die;
      if ($string ne 'def') {
          print("BUG! string should be 'def' but is '$string'\n");
      } else {
          print("No bug\n");
      }
      BUG! string should be 'def' but is 'abc'
Re: Bug : eval and accent
by jbert (Priest) on Nov 16, 2007 at 11:38 UTC
    Leaving aside the eval bug, if you wanted to make this code work, that's what the use utf8 pragma is for (permitting utf8 chars in the script source). It's lexically scoped, so you'll want it in each eval, so you could prepend it to the source code.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;

    my $var_with_é_accent = 10;
    print "var is $var_with_é_accent\n";
    works fine on my perl 5.8.8 system. Maybe this helps you in the short term?
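As a rough sketch of the "prepend it to each eval" idea: the helper below (the name `eval_latin1_code` is hypothetical) assumes the user's code arrives as Latin-1 bytes, re-encodes it as UTF-8 bytes with the core Encode module, and evals it with "use utf8;" prepended. How it behaves on a particular Perl build is something to verify locally, given the problem discussed in this thread.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: Latin-1 bytes in, (value, error) out.
sub eval_latin1_code {
    my ($latin1_code) = @_;

    # Latin-1 bytes -> characters -> UTF-8 bytes.
    my $utf8_code = encode('UTF-8', decode('ISO-8859-1', $latin1_code));

    # "use utf8" is lexically scoped, so it goes inside the evaluated string.
    my $value = eval "use utf8; $utf8_code";
    return ($value, $@);
}

# \xE9 is the Latin-1 byte for e-acute, as in the original post.
my ($value, $error) = eval_latin1_code(
    "my \$var_with_\xE9_accent = 5; \$var_with_\xE9_accent * 2"
);
print $error ? "ERR = $error" : "value is $value\n";
```

Returning the value of the last statement avoids relying on lexicals outside the helper, which the evaluated code could not see anyway.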
      Your code does not work for me on "Perl ActiveState v5.8.8 built for MSWin32-x86-multi-thread OS Windows XP".
      I'm getting the error "Unrecognized character \xE9 at Untitled line 6".
      I wonder if this is an ActiveState issue.
        That error would arise if the file containing the script is not written/stored as utf8 data -- the accented character would need to be a two byte sequence 0xc3 0xa9 in order to be the utf8 version of é.

        0xE9 is the cp1252/iso-8859-1 (single-byte) version, and having that form of the character in a script file that also has "use utf8" is a problem with the file, not with perl.
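One way to check which situation applies is to test whether the bytes are well-formed UTF-8. The sketch below uses the core Encode module with its FB_CROAK and LEAVE_SRC flags; the helper name `is_valid_utf8` is made up for this example, and in practice the bytes would be read from the script file in binary mode.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns 1 if $bytes is well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC stops it from modifying $bytes in place.
    my $ok = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

print is_valid_utf8("my \$v\xC3\xA9;") ? "utf8 ok\n" : "not utf8\n";  # utf8 ok
print is_valid_utf8("my \$v\xE9;")     ? "utf8 ok\n" : "not utf8\n";  # not utf8
```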

        Hmm...are you sure the file is in utf8? E9 is latin1 for e-acute.

        Can you dump the hex values and check? (or just try code below).

        If your users are entering latin-1, then you could use the Encode module to shift the script to utf8.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Encode;
        use utf8;

        my $code = <<"EOLATINCODE";
        my \$var_with_\xE9_accent = 10;
        print \"var is \$var_with_\xE9_accent\\n\";
        EOLATINCODE
        $code = decode('latin1', $code);
        print "CODE is $code\n";
        eval $code;
        if ($@) {
            print "DIED: $@\n";
        }
        You don't seem to need 'use utf8' in the eval'd code in this case, but you seem to need it in the containing script.
        Your script must be in utf8; it is currently latin1. Convert the script to utf8 and retry.
        Boris
Re: Bug : eval and accent
by SFLEX (Chaplain) on Nov 16, 2007 at 11:34 UTC
    This could be a bug in Perl, or you're not using the correct syntax to handle the variable.
    Here is my fix to your code.
    #!/usr/bin/perl
    my $code1='my $var_with_é_accent;';
    print "Eval buggy code\n";
    eval {$code1;}; # Should see "Useless use of private variable in void context at Untitled line 6."
    print "ERR = $@" if ($@);; # Error doesnt show here
    print "DONE\n";

    my $string;
    my $code2='$string="new eval done";';
    print "Eval good code\n";
    eval $code2;
    print "ERR = $@" if ($@);
    print "string is : $string\n"; # This Works
    print "DONE\n";


    Good Luck ^^
      You're aiming at the wrong thing; see Re^2: Bug : eval and accent. Moreover, the two evals are definitely not the same; see the docs. You can't eval the contents of a string using the block form; the warning you see is the same one you would get by simply writing:
      $code1;
      without the eval, i.e. using a private variable in a void context. Try perl -we 'my $x; $x' (quoting may change if you're in Windows) on the command line to see it.
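A minimal illustration of the difference between the two forms (the variable names are chosen for this example only):

```perl
use strict;
use warnings;

my $code = '1 + 2';

# Block form: the block is compiled together with the surrounding program.
# $code is just a variable here, so the block returns the string itself.
my $from_block = eval { $code };

# String form: the contents of $code are parsed and run as Perl at run time.
my $from_string = eval $code;

print "block form:  $from_block\n";   # 1 + 2
print "string form: $from_string\n";  # 3
```

This is why replacing `eval $code1` with `eval {$code1;}` makes the error disappear: the accented variable name is never compiled at all.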

      As sylvanus_monachus points out, the bug is in the second eval, which is executed (see ikegami's Re: Bug : eval and accent) but not correctly.

      In any case, I'd surely avoid eval-ing code that comes from the outside world... unless it's me ;)

      Flavio
      perl -ple'$_=reverse' <<<ti.xittelop@oivalf

      Io ho capito... ma tu che hai detto?
Bug : eval and accent
by sylvanus_monachus (Novice) on Nov 16, 2007 at 09:42 UTC
      Hm, this is a strange one. While the accented variable name is important, the 'my' declaration is also important.

      eval qq/my \$v\xE9/;
      eval q/print $string; $string = "new value"/;
      print $@ if $@;
      Results in the error:
      "my" variable $string masks earlier declaration in same scope at (eval 2) line 1.

      Somehow $string is being declared lexical in the second eval. The specific error being triggered in the first eval (Unrecognized character \xE9) has a special case in toke.c, and I can only think this early exit is leaving the tokenizer in an odd state.

      You should report this bug using the perlbug program you should have installed on your system. If you wish to include the details I've mentioned please feel free.

        Thanks, monks, for your responses. I will report a Perl bug soon.
      Don't need eval.

      use strict;
      use warnings;

      my $var_with_é_accent;
      Gives the error:

      Unrecognized character \xE9 at line 4.

      WinXP SP2, ActivePerl 5.8.8 (build 817).

      I guess you could argue that the error "breaks" the parser, and that it shouldn't really propagate outside of an eval.

      Do the Perl docs actually say you can have non-ascii variable names?

      -David

        All of this is on 5.8.3, ActiveState build 809

        Yes, the bug is that an error within eval should not affect other evals. I'm not sure where the problem lies, because the following code still works as expected and raises no error, no matter whether $string is a global or a lexical variable:

        use strict;
        use Test::More tests => 12;

        sub second_eval_unaffected($) {
            my ($code) = @_;
            eval $code;
            diag $@ if $@;
            my $string = '';
            ok eval(q($string="new eval done";)), "Eval after >>$code<< still works";
            is $@, '', "... and raises no error either.";
            is $string, 'new eval done', "... and sets the variable correctly";
        };

        second_eval_unaffected 'my $var_without_accent;';
        second_eval_unaffected 'my $var_with = "é_accent";';
        second_eval_unaffected 'my statement_with_error;';
        second_eval_unaffected 'my $var_with_é_accent";';

        This program exhibits the problem as a self-contained Test::More program. The first sequence of steps leads to a failure, while the second run succeeds. This should not happen, but I don't know why :)

        use strict;
        use Test::More tests => 10;

        sub gauntlet {
            my $code1=shift;
            my $string;
            my $expected = "new eval done";
            my $code2='$string="new eval done";';

            ok eval($code2), "Good code evals correctly";
            is $@, '', "... and raises no error on its own.";

            diag "Eval trial code\n";
            diag $code1;
            eval $code1;
            diag '$@ is ', $@;

            undef $string;
            diag "Eval good code ($code2)\n";
            ok eval $code2;
            is $@, '', "No error raised";
            is $string, $expected;
        };

        gauntlet('my $var_with_é_accent;');
        gauntlet('my $var_without_accent;');
        The error is properly caught by the first eval. That's not the problem. The problem is that the error causes subsequent evals to work incorrectly (silently unable to access lexicals, at least).

        i have made a new post with renders element...

        In fact, we eval code supplied by users. I know that variable names are ASCII in Perl, but my users don't, and our application needs to run more evals after evaluating a user's buggy code.
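Until the underlying bug is fixed, one possible stopgap (an assumption of this note, not something proposed in the thread) is to screen user code for non-ASCII bytes before handing it to eval, so the "Unrecognized character" path in the tokenizer is never reached. The wrapper name `guarded_eval` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper: refuse to compile code containing non-ASCII bytes.
# Returns (value, error); error is the empty string on success.
sub guarded_eval {
    my ($code) = @_;
    if ($code =~ /([^\x00-\x7F])/) {
        my $msg = sprintf("non-ASCII character \\x%02X rejected before eval\n", ord $1);
        return (undef, $msg);
    }
    my $value = eval $code;
    return ($value, $@);
}

# The user's buggy code is reported as an error without being compiled.
my ($value, $error) = guarded_eval("my \$var_with_\xE9_accent;");
print "ERR = $error" if $error;

# A later, well-formed snippet is evaluated normally.
($value, $error) = guarded_eval('"new eval done"');
print "string is : $value\n" unless $error;
```

Note that code evaluated inside the wrapper sees the wrapper's lexical scope rather than the caller's, so results are passed back through the return value. This check is deliberately blunt: it also rejects non-ASCII characters inside string literals.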