http://qs321.pair.com?node_id=884800

andreas1234567 has asked for the wisdom of the Perl Monks concerning the following question:

Note: See also bug #83364 (RESOLVED) filed towards perl core.

I have a perl daemon that in principle (unless stopped or restarted) will run forever. From time to time it crashes inexplicably after having emitted errors like shown below:

Variable "$self" will not stay shared at (eval 999973501) line 15. Variable "$self" will not stay shared at (eval 999975201) line 15. Global symbol "$i" requires explicit package name at (eval 999977609) +line 42.
I notice the eval counter (if such a thing exists) in each case seems to be very close to 10**9. Is there a hard coded limit somewhere in the perl interpreter that might explain this behavior, or should I look in the mirror for root causes?

Update 1: Similar symptoms previously described here.

Update 2: I manage to reproduce the problem using a slightly altered verison of code from this node (thanks Jeppe). Note that the problem only manifests after running almost 4 hours on my Intel(R) Core(TM)2 Quad CPU Q9450 @2.66GHz, (perl v5.10.0, x86_64-linux, Linux 2.6.18) :

use strict; use warnings; use Inline C => q{ int evalseq() { return PL_evalseq; }}; use Log::Log4perl; use Log::Dispatch::FileRotate; my $conf = q( log4perl.category.Script = DEBUG, ScriptFileRotateAppender log4perl.appender.ScriptFileRotateAppender = Log::Dispatch +::FileRotate log4perl.appender.ScriptFileRotateAppender.filename = evalseq.log log4perl.appender.ScriptFileRotateAppender.mode = append log4perl.appender.ScriptFileRotateAppender.size = 10000000 log4perl.appender.ScriptFileRotateAppender.max = 50 log4perl.appender.ScriptFileRotateAppender.layout = PatternLayout log4perl.appender.ScriptFileRotateAppender.layout.ConversionPattern= +[%p] %d %m%n ); Log::Log4perl::init( \$conf ); my $log = Log::Log4perl::->get_logger(q(Script)); my $count = 0; until ($@) { $count++; eval 'my $abc = 1; $abc++; '; unless ($count % 1e5) { $log->info( "count=$count evalseq=" . evalseq() ); } } $log->error( $count .' ' .$@ ); __END__ $ perl -v This is perl, v5.10.0 built for x86_64-linux $ uname -a Linux desktop 2.6.18-194.8.1.el5 #1 SMP Thu Jul 1 19:04:48 EDT 2010 x8 +6_64 x86_64 x86_64 GNU/Linux
The output:
[INFO] 2011/01/31 10:03:54 count=100000 evalseq=100026 [INFO] 2011/01/31 10:03:55 count=200000 evalseq=200027 [INFO] 2011/01/31 10:03:56 count=300000 evalseq=300028 [INFO] 2011/01/31 10:03:58 count=400000 evalseq=400029 [INFO] 2011/01/31 10:03:59 count=500000 evalseq=500030 ... [INFO] 2011/01/31 13:45:23 count=1073300000 evalseq=1073310758 [INFO] 2011/01/31 13:45:24 count=1073400000 evalseq=1073410759 [INFO] 2011/01/31 13:45:26 count=1073500000 evalseq=1073510760 [INFO] 2011/01/31 13:45:27 count=1073600000 evalseq=1073610761 [INFO] 2011/01/31 13:45:28 count=1073700000 evalseq=1073710762 [ERROR] 2011/01/31 13:45:29 1073733744 Global symbol "$abc" requires e +xplicit package name at (eval 1073744507) line 1.

Update 3: Reproduced in 7 hours with perl v5.10.1 MSWin32-x86-multi-thread:

use strict; use warnings; use Log::Log4perl; my $conf = q( log4perl.category.Script = DEBUG, ScriptFileRotateAppender, + Screen log4perl.appender.ScriptFileRotateAppender = Log::Log4perl +::Appender::File log4perl.appender.ScriptFileRotateAppender.filename = evalseq.log log4perl.appender.ScriptFileRotateAppender.mode = append log4perl.appender.ScriptFileRotateAppender.size = 10000000 log4perl.appender.ScriptFileRotateAppender.max = 50 log4perl.appender.ScriptFileRotateAppender.layout = PatternLayout log4perl.appender.ScriptFileRotateAppender.layout.ConversionPattern= +[%p] %d %m%n log4perl.appender.Screen = Log::Log4perl::Appender::Screen log4perl.appender.Screen.stderr = 0 log4perl.appender.Screen.layout = PatternLayout log4perl.appender.Screen.layout.ConversionPattern=[%p] %d %m%n ); Log::Log4perl::init( \$conf ); my $log = Log::Log4perl::->get_logger(q(Script)); my $count = 0; until ($@) { $count++; eval 'my $abc = 1; $abc++; '; unless ($count % 1e5) { $log->info( "count=$count" ); } } $log->error( "count=$count " .$@ ); __END__
Output:
[INFO] 2011/02/01 13:53:09 count=100000 [INFO] 2011/02/01 13:55:12 count=100000 [INFO] 2011/02/01 13:55:14 count=200000 ... [INFO] 2011/02/01 20:49:06 count=1073500000 [INFO] 2011/02/01 20:49:08 count=1073600000 [INFO] 2011/02/01 20:49:11 count=1073700000 [ERROR] 2011/02/01 20:49:12 count=1073740465 Global symbol "$abc" requ +ires explicit package name at (eval 1073740483) line 1.
--
No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

Replies are listed 'Best First'.
Re: Interpreter eval counter out of bounds
by dave_the_m (Monsignor) on Jan 28, 2011 at 13:39 UTC
    Well, the "eval counter" (PL_evalseq) is a 32-bit unsigned integer, so I'd expect crashing and burning, if any, to happen after 4.29E9 evals.

    Do you have the ability to log the evals it's executing, in order to see whether the eval compile warnings/errors make any sense?

    Dave.

      Do you have the ability to log the evals it's executing

      I think Devel::NYTProf provides that ability. At least it has to capture the eval'ed text internally to profile it; maybe it provides an API, or you can grep for the eval'ed text in the report.

      Andreas' coworker here!

      The strange thing is that once this error occurs, evals that previously worked all of a sudden fail. Some of the code that is evaled is in fact logged to file (under /tmp), and they are valid:

      username@server tmp$ /opt/perl/bin/perl -c /tmp/agent_code.4561.pl 
      /tmp/agent_code.4561.pl syntax OK
      
      This is code that works for days, and works again after a restart of the application (which of course means we get a fresh interpreter to mess up)
      Do you have the ability to log the evals it's executing, in order to see whether the eval compile warnings/errors make any sense?
      In theory yes, but I assume impact on performance would be unacceptable as well as the log files themselves would be difficult to manage (log rotation would happen too quickly). I have not been able to reproduce the problem outside of a specific production environment with certain difficult-to-change log requirements.
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

        I have a perl daemon that in principle (unless stopped or restarted) will run forever. . .

        All of our production scripts restart themselves after some time is passed, ( i.e. 4-hours, 24-hours, etc. ).

        If you have 'cron' you might want to restart the process periodically. If you can't use 'cron' then the script can test on how much time has passed and use 'system' to restart itself and then exit.

        Your script will be happy with a fresh start.

        "Well done is better than well said." - Benjamin Franklin

      if this is actually the case - then - fortunately a bug could be fixed, and even a small XS "fix" module could be created to "reset" said variable PL_evalseq (I hope it is exported?)
Re: Interpreter eval counter out of bounds
by sundialsvc4 (Abbot) on Jan 28, 2011 at 14:05 UTC

    Yes, “look in the mirror.”   (No offense intended.)   There must be a flaw bug in this code.   It rather looks to me like the code is somehow trying to eval a string that consists of an integer value.

    I think you need to pore through some of those existing logfiles, e.g. with grep, because I suspect that you will find millions of occurrences of this whatever-it-is error.   And such an error could turn out to be quite serious, because I doubt that you would have intended to be evaluating an integer ... which leads to the question, “how did it get such a thing?” ... which of course also leads to, “what else is wrong?”

      Actually, there are likely bugs in Perl. Using mod_perl results in instances of the Perl interpreter running for quite a long time doing quite a variety of things. This can shake out bugs that are hard to reproduce.

      When running mod_perl, I've seen particular instances of the Perl interpreter get into states where regex matching just doesn't work quite right any longer, to pick one example.

      I'm not terribly surprised that after 1e9 string evals, you often collect enough entropy that things that should work rather mysteriously no longer work.

      The solution of restarting is pretty much a best practice:

      exec( $^X, $0, @ARGV ); die "Can't exec self: $!\n";

      I'd also question why one has a need to do 1e9 string evals. This concern is increased by the statement "evals that previously worked all of a sudden fail" which seems to indicate that the evals are being given the same strings over and over.

      Running string eval on the same string more than once makes me suspect that perhaps block eval should have been used instead. Or perhaps string eval should be used to make a sub that can be called over and over. But it can be little more than speculation, of course.

      Update: And I doubt that the eval count being close to 1e9 is significant. You could make a tiny, stand-alone demonstration that just does string evals and see when it falls over. If the count were close to ~0 or ~0/2 or even if it had just passed 1e9-1, then I could see how it might be a limit.

      - tye        

Re: Interpreter eval counter out of bounds
by BrowserUk (Patriarch) on Jan 29, 2011 at 21:59 UTC

    Have you checked whether your process is leaking memory?

Re: Interpreter eval counter out of bounds
by ikegami (Patriarch) on Jan 28, 2011 at 19:06 UTC

    "Will not stay shared" is a warning that indicates one made the error of nesting nesting named subs. Specifically, it is issued when an inner named sub captures vars from the outer sub.

    $ perl -we'sub outer { my $x; sub inner { $x } }' Variable "$x" will not stay shared at -e line 1.

    Don't nest named subs. It's not like inner is private to outer.

    "Global symbol requires explicit package name" is an error detected by "use strict;".

    $ perl -we'use strict; $i' Global symbol "$i" requires explicit package name at -e line 1. Execution of -e aborted due to compilation errors.

    Loading Carp::Always might be a simple way of getting useful diagnostics.

    I notice the eval counter (if such a thing exists) in each case seems to be very close to 10**9.

    109 is not a particularly interesting number to computers. If it was 22n or 22n-1 (e.g. 28, 215, 216, 231, 232, etc), then it would get my attention.

    (Why oh why do you generate and execute so many code snippets!)

      We are well aware that we should not nest nested named subs. We don't do that. And the code works normally. It's just that all of a sudden the code stops working (well, the perl interpreter stops working it seems), and the eval counter is at roughly the same position each time the code stops working. Given that the code is non-deterministic, what portion of the code that fails is random.

      Some of the code snippets that increase the eval counter are well justified, some are perhaps not. But currently, the non-deterministic application is crashing on a regular basis, and the only common factor we've found is that the counter is around 999970000. It seems like a major coincidence.

        I'm currently running the following code snippet:

        perl -Mstrict -wle "my@r;$|++;while($_++<$ARGV[0]){eval sprintf q{my$s +elf=\%s;push @r,sub{ $self }},$_,$_; if(!($_%100000)){printf qq(%010d +\r),$_;@r=()}}" 1000000000

        but I've only reached 479_000_000 iterations so far, doing about 10_000 evaluations per second. If that doesn't turn up any errors, maybe we should work on code that pumps up the counter more quickly to 999_000_000 string evals and then slowly iterates there.

        I think what would help was some more information about the nature of the code you're evaluating. Is my code anywhere near, creating closures? Or don't closures come into play with your code at all?

        Update: The code has run through (on Strawberry Perl 5.12) without a hitch.

        Update: Running the code for somewhat longer actually produces the error on Strawberry 5.12 for me too:

        Expect crash1073700000 Expect crashGlobal symbol "$abc" requires explicit package name at (ev +al 1073741769) line 1.
        One thing that could support the theory due to interpreter corruption is that both errors are related to lexical variables. It could be a problem with the pad of the function doing the eval.
Re: Interpreter eval counter out of bounds
by Corion (Patriarch) on Feb 06, 2011 at 20:14 UTC

    dave_the_m just pushed a set of changes that should fix this. It seems the (deep) cause was an overflow of a sequence counter that reached a magic value.

    1. p5git://commit=d1bfb649a8c4cc0b39b0fd6371c009b71c0b497f - changes PAD_MAX from I32_MAX to U32_MAX
    2. p5git://commit=2df5bdd700b8a6ef036aa5be19d4bda9e3ca20bc - renames PAD_MAX to PERL_PADSEQ_INTRO, as the value is now used as a magic value to signify lexical variable introduction
    3. p5git://commit=0d311cdbe540a08e243ac533c7bcdb73529352db - introduces (and documents) the states that an SV can have in a loop (I'm not sure I understand the mechanics involved here):
      COP_SEQ_RANGE_LOW _HIGH ----------------- ----- PERL_PADSEQ_INTRO 0 variable not yet introduced: { m +y ($x valid-seq# PERL_PADSEQ_INTRO variable in scope: { m +y ($x) valid-seq# valid-seq# compilation of scope complete: { m +y ($x) }
    4. p5git://commit=6012dc8015d0c5599f09bcca302ae714d023aced - allow the sequence number to wrap instead of reaching the magic value.
      Surprisingly, this patch touches a lot of places where I would have thought it would touch just the place where the sequence number is incremented. But it seems that the sequence number is incremented in several places.

    Looking at the changes involved, I'm not sure whether the eval was necessary to trigger the bug, but that may be due to my lack of understanding the mechanics involved.

      Dave writes:
      Its a good candidate for maint a cherry-pick.
      Does that mean there is a chance we might see the patch back-ported to older perls, e.g. 5.8 and 5.10 ?
      --
      No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

        No. There is no maintainer for 5.8 or 5.10, so it's highly unlikely. The only Perl version in active maintenance is 5.12. I remember Zefram musing about doing a maintenance release of 5.10 (that would be 5.10.3), but I also remember the discussion mostly leaning towards build failures and not doing real work there.

        Update: Found the p5p post about doing a 5.10.2 and possibly 5.10.3. I guess you could lobby Zefram for including that patch, but there is little guarantee for anything.