Exec'd perl running wild on cpu-time

by jeroenes (Priest)
on Nov 13, 2000 at 17:56 UTC

jeroenes has asked for the wisdom of the Perl Monks concerning the following question:

As a humble novice, I would like to ask for some advice about a Perl script which I run in background mode. Actually, it is more of a How-Do-I-Run-This question than a Perl-coding question, methinks. Let me first start with a short description of the script.

The script wakes up (or should wake up) every 5 minutes, retrieves traffic-jam info from a Dutch site, logs it, and, when the time is right, reports this info in an e-mail to my wife, who uses it to determine the time of her homeward return. As you can imagine, this script really is a part of our married life ;-).

The main loop is as follows:

while (1) {
    @the_html = init_html();
    @data     = parse_data(@the_html);
    if (@data) {
        write_data(filter_data(@data));
        mail_data(filter_data(@data));
    }
    else {
        write_data("\tNo connection or data, so",
                   "\tsleeping until next fetch.");
    }
    sleep(300);
}
I use IO::Socket to retrieve the web page. The script runs just fine, but after a couple of days it starts to eat CPU time, so I have to kill the process.
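
For the curious: stripped to its essentials, the fetch looks something like this sketch (the real host and path are left out; www.example.nl and /files.html are placeholders):

use IO::Socket::INET;

# Sketch of the fetch routine; host and path are placeholders.
sub init_html {
    my $sock = IO::Socket::INET->new(
        PeerAddr => 'www.example.nl',
        PeerPort => 80,
        Proto    => 'tcp',
        Timeout  => 30,            # don't hang forever on a dead server
    ) or return;                   # no connection: return an empty list

    print $sock "GET /files.html HTTP/1.0\r\n",
                "Host: www.example.nl\r\n\r\n";
    my @the_html = <$sock>;        # slurp the response line by line
    close $sock;
    return @the_html;
}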

Normally I ran it with  exec get_files.pl files &, after which I killed the terminal. But lately I discovered that if I leave the terminal open, the script keeps running fine. It has now been running for one week without a problem.

It runs, my wife is happy, and so am I. But I still don't know why the script goes mad when I close the terminal. Is there a reason? Am I doing something wrong? I'm very curious about this.

If your (our?) great society of monks could come up with an answer, a humble novice would be made very happy.

May the Perl be with you,

Jeroen

PS: If any of you happen to live in Holland and are interested in the script itself, don't hesitate to contact me.

Replies are listed 'Best First'.
Re: Exec'd perl running wild on cpu-time
by Corion (Patriarch) on Nov 13, 2000 at 18:05 UTC

    If you want a script to run periodically, that is always a job for cron, the program that starts scheduled tasks under the various flavors of Unix. Then you can do away with the while(1){} loop and let your program run just once every 5 minutes. If you are working under Win32, there is either the dreaded Scheduler service under Windows 9x (avoid it; it needs a person logged in) or the at service under NT, which has a different syntax but does what cron does under Unix.

    Much more interesting, though, is why your Perl program racks up that much CPU time at all, but I'm at a loss there.

      Actually, I run the script continuously. But, before finding the "open terminal" solution, I was thinking about making a cron job to periodically kill and restart the script. That way, I could get rid of the open terminal, day and night and on weekends...

      Indeed, I consider the CPU-black-holish thing puzzling. Maybe there is (and I hope so) an obvious/stupid flaw in my way of handling the script.

        Of course, this is drifting away from Perl, but in the spirit of the Right Tool for the Right Job: cron knows all about weekdays and time slots. I would use cron (or at, which can do that stuff under NT) with the following crontab entry to ensure that it checks every 5 minutes during the week, from 7:00h to 20:00h, Monday through Friday:

        0,5,10,15,20,25,30,35,40,45,50,55 7-20 * * 1-5 /path/to/get_files.pl files
        Note: I didn't check that line, and I only worked from the crontab(5) manpage. Usually that means some tweaking is required afterwards.
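
        Run from cron, the while(1)/sleep(300) scaffolding disappears and each invocation makes a single pass. A minimal sketch, reusing the routines from the loop posted above (and as untested as the crontab line):

        #!/usr/bin/perl -w
        # One pass per invocation; cron supplies the every-5-minutes part.
        my @the_html = init_html();
        my @data     = parse_data(@the_html);
        if (@data) {
            write_data(filter_data(@data));
            mail_data(filter_data(@data));
        }
        else {
            write_data("\tNo connection or data this run.");
        }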

Re: Exec'd perl running wild on cpu-time
by Fastolfe (Vicar) on Nov 13, 2000 at 18:33 UTC
    Your main loop offers no clues at all as to the CPU problem. Have you tried using strict and running Perl with warnings (-w) enabled?

    Note that your filter_data function is being called twice: once for write_data and once for mail_data. If this is an expensive function, you should consider running it once, storing the results, and passing those results to each of the two functions.
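
    Something like this, storing the filtered data once:

    my @filtered = filter_data(@data);   # filter once...
    write_data(@filtered);               # ...then reuse the result
    mail_data(@filtered);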

    If your main functions are too long to post, consider just reading through them and checking your assumptions and error handling. What happens if it can't get a response? What happens if the response it gets doesn't match your expectations? Are you looping and using a variable to determine when the loop should end? If so, are you sure this variable allows the loop to exit when unexpected things happen?

    In the past, I've occasionally used 'strace' (or 'truss' or 'ktrace', depending on your OS) against a process stuck like that. That usually lets me see what system calls it's trying to make (if any). Sometimes this points me to the right place in my code.
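
    Attaching to the running process looks something like this (the PID here is just an example; use whatever ps reports for your script):

    strace -p 12345 -o trace.log

    A busy-looping process usually shows the same handful of system calls repeating endlessly in the trace.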

      Quite a few remarks... thank you! 1. No warnings yet. I still haven't looked into module programming, so I don't know what to do with the 'import' stuff. It has been moved onto my 'todo' list....

      2. Indeed, the function is called twice. Normally, CPU time is no issue; it's only two minutes/day MAX. And I think most of the time is spent waiting for the HTML content.

      3. If any connection or whatever fails, the script simply terminates with a die(). No problems from that perspective (i.e., a terminated script) thus far.

      4. If the content doesn't match, the filter just returns an empty array, and only the 'templates' are logged/mailed. The filter routine and the parse routine are quite simple: just some coupled regexes that fill some arrays. If something goes wrong, an array ends up empty, that's all. If a connection fails, the script will die().

      5. see next answer

      Jeroen

      I was dreaming of guitarnotes that would irritate an executive kind of guy (FZ)

        Another thing I would recommend is setting yourself up with a debugging log. Have it log the last few (or all) of its requests and the results it gets back from the server. The next time your script misbehaves, take a look at this log to see what it's acting upon. Putting 'markers' or 'checkpoints' in various places in and around your loops, so that information is logged as your program reaches various points of execution, would also let you trace the flow of execution, though if your script is entering an infinite loop (which I suspect it is), this log file will fill up fast.
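
        A minimal sketch of such a logging routine (the filename and the log_debug name are just placeholders):

        # Append a timestamped line to a debug log; never let logging
        # itself kill the script, so failures are silently ignored.
        sub log_debug {
            open my $log, '>>', '/tmp/get_files.debug' or return;
            print $log scalar(localtime), ' ', @_, "\n";
            close $log;
        }

        log_debug('entering fetch');   # checkpoint before the request
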
Re: Exec'd perl running wild on cpu-time
by ChOas (Curate) on Nov 13, 2000 at 18:44 UTC
    Eeehm, dunno if this helps, but seeing you use 'kill' kinda makes me guess you're on some form of *NIX. Though this is not a real answer, it might help: inserting 'nohup' in front of your program will let it ignore the HUP signal (which gets sent to your program when you kill the terminal)...
    e.g.: nohup exec get_files.pl files &

    Bye!!
      Aha! Ambient light fills my spinning head.... I would say this should be THE answer. I killed the script and restarted it with nohup. Now I'm heading for the monastery's library....

      Cheers,
      and thanx 2u all!

      Jeroen

      I was dreaming of guitarnotes that would irritate an executive kind of guy (FZ)

        Hmmm... reply to my own reply... spells disaster, doesn't it?

        Well, according to the library, what nohup does could also be accomplished by putting $SIG{HUP} = 'IGNORE'; in the script itself. That's at least a more Perlish solution ;-).
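
        That is, right at the top of the script:

        # Ignore SIGHUP so closing the terminal doesn't take the script down.
        $SIG{HUP} = 'IGNORE';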

        May the Pearl be at your convenience, and thanks for the fishes. Or whatever.

        Jeroen

        Post Posting: I'll keep you _posted_ about the stability of the script. Thanks.

        Post Post Posting: Oh yeah, 'file' is Dutch for 'traffic jam'. In case you were wondering about the awkward use of 'file' above.

      Doing something like this seems to have the same effect (it's what I use):
      (exec get_files.pl files &)
      I wish I knew what the difference was.
Re: Exec'd perl running wild on cpu-time
by elwarren (Priest) on Nov 13, 2000 at 21:09 UTC
    If the nohup helped your problem, then I would assume it's some sort of issue with your OS's handling of STDOUT and STDERR after detaching from the terminal. If you are happy with the nohup solution (a good solution), you could leave it at that, but if you're still interested in tracking down your bug, I would redirect STDOUT and STDERR to a log, as Fastolfe mentioned a couple of posts above. This may also solve your problem. You could also take a look at the Net::Daemon or Proc::Daemon modules.
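
    For what it's worth, daemonizing with Proc::Daemon amounts to one call at startup (a sketch; the module does the fork/new-session/close-filehandles dance for you):

    use Proc::Daemon;

    # Fork into the background, detach from the controlling terminal,
    # chdir to / and close inherited file handles.
    Proc::Daemon::Init();
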
      Wow! More useful tips! Yes, I have actually already shifted to redirecting with >>log.out 2>&1. This may have solved the problem as well, as I don't recall the day I switched to the redirection. For the curious, the logfile is still empty, but I don't dump any of the data, because that would clutter the log with too many bits. I'll certainly look into the daemon modules.

      It feels like 42...

      Update: I changed the script at several points.

      1. I read the docs on packages and stuff, and I now use strict.

      2. The nohup exec command resulted in an immediate exit; I think it should be plain nohup, without the exec. Anyway, I now use $SIG{HUP} = 'IGNORE', and I didn't try the nohup/() variants.

      3. There is more to starting daemons than I thought. I read the Net::Daemon and Proc::Daemon PODs, and there is a complete sequence needed for detaching from a terminal. Because I don't need sockets for my script, I will use Proc::Daemon.

      4. However, I'm postponing coding this for a while. The script has now been running for more than a day without problems.

      Thanks to you all!
