jeroenes has asked for the wisdom of the Perl Monks concerning the following question:
As a humble novice, I would like to ask some advice about
a Perl script which I run in background mode.
Actually, it is more a How-Do-I-Run-This question than a
Perl-coding question, methinks. Let's first start with a short
description of the script.
The script wakes up (or should wake up) every 5 minutes,
retrieves traffic-jam info from a Dutch site, logs it, and,
when the time is right, reports this info in an e-mail to
my wife, who uses it to decide on the time of her homeward
return. As you can imagine, this script really is a part of our
married life ;-).
The main loop is as follows:

    while (1) {
        @the_html = init_html();
        @data     = parse_data(@the_html);
        if (@data) {
            write_data(filter_data(@data));
            mail_data(filter_data(@data));
        } else {
            write_data("\tNo connection or data, so",
                       "\tsleeping until next fetch.");
        }
        sleep(300);
    }
and I use IO::Socket to retrieve the web page. The script runs
just fine, but after a couple of days it starts to eat CPU time,
so I have to kill the process.
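Since the page is fetched with raw IO::Socket, a stalled server is one classic way a polling loop can end up misbehaving, so it may be worth giving the socket a hard timeout. This is only a sketch: the host and path are placeholders (not the actual Dutch traffic site), and fetch_page merely stands in for whatever init_html really does.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build a minimal HTTP/1.0 request for a given host and path.
sub build_request {
    my ($host, $path) = @_;
    return "GET $path HTTP/1.0\r\nHost: $host\r\nConnection: close\r\n\r\n";
}

# A fetch with a hard timeout, so a hung server cannot block the loop.
# Returns the raw response, or undef on failure (caller logs and sleeps).
sub fetch_page {
    my ($host, $path) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
        Timeout  => 30,    # give up instead of waiting forever
    ) or return;
    print $sock build_request($host, $path);
    local $/;              # slurp the whole response at once
    my $html = <$sock>;
    close $sock;
    return $html;
}
```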
Normally I ran it with exec get_files.pl files &,
after which I killed the terminal. But lately I discovered that if
I leave the terminal open, the script keeps running without a problem.
At the moment it has been running for one week without a problem.
It runs, my wife is happy, and so am I. But, I still don't know
why the script goes mad when I close the terminal. Is there a reason?
Am I doing something wrong? I'm very curious about this.
If your (our?) great society of monks could come up with an
answer, a humble novice would be made very happy.
May the Perl be with you,
Jeroen
PS: If any of you happen to live in Holland and are interested
in the script itself, don't hesitate to contact me.
Re: Exec'd perl running wild on cpu-time
by Corion (Patriarch) on Nov 13, 2000 at 18:05 UTC
If you want a script to run periodically, that is always
a call for cron, the program that starts scheduled tasks
under the several variants of Unix.
You can then do away with the while(1){} loop
and let your program run just once every 5 minutes.
If you are working under Win32, there is either the
dreaded Scheduler service under Windows 9x (avoid it;
it needs a person logged in) or the at service
under NT, which has a different syntax but does what
cron does under Unix.
Much more interesting, though, is why your Perl program
racks up that much CPU time at all, but I'm at a loss here.
Of course, this is drifting away from Perl, but in the
spirit of the Right Tool for The Right Job,
cron knows all about weekdays and time slots.
I would use cron (or at, which can do that
stuff under NT) with the following crontab
entry to ensure that
it checks every 5 minutes from 7:00 to 20:00,
Monday through Friday:
0,5,10,15,20,25,30,35,40,45,50,55 7-20 * * 1-5
Note: I didn't check that line and also I only worked
from the crontab (5) manpage. Usually this
means that some tweaking is required afterwards.
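As a side note, Vixie cron (the crontab found on most Linux systems) also accepts step values, which shortens the minutes field considerably. The command path at the end is a made-up placeholder, since a crontab entry needs one:

```
*/5 7-20 * * 1-5 /home/jeroen/get_files.pl files
```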
Re: Exec'd perl running wild on cpu-time
by Fastolfe (Vicar) on Nov 13, 2000 at 18:33 UTC
Your main loop offers no clues at all as to the CPU problem. Have you tried using strict and running Perl with warnings (-w) enabled?
Note that your filter_data function is being called twice: once for write_data and once for mail_data. If this is an expensive function, you should consider running it once, storing the results, and passing those results to each of the two functions.
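A minimal sketch of that change, with stub routines standing in for the real ones just to make it runnable:

```perl
use strict;
use warnings;

my $filter_calls = 0;

# Stand-ins for the real routines, only here to make the sketch complete.
sub filter_data { $filter_calls++; return grep { /jam/ } @_ }
sub write_data  { }
sub mail_data   { }

my @data = ('jam on A2', 'clear on A4');

# Filter once, then reuse the result for both the log and the mail.
my @filtered = filter_data(@data);
write_data(@filtered);
mail_data(@filtered);
```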
If your main functions are too long to post, consider just reading through them and check your assumptions and error checking. What happens if it can't get a response? What happens if the response it gets doesn't follow your expectations? Are you looping and using a variable to determine when the loop should end? If so, are you sure this variable is allowing the loop to exit when unexpected things happen?
In the past, I've occasionally used 'strace' (or 'truss' or 'ptrace', depending on your OS) against a process stuck like that. That usually lets me see what system calls it's trying to do (if any). Sometimes this points me to the right place in my code.
Quite a few remarks... thank you!
1. No warnings. I still haven't looked into module programming,
so I don't know what to do with the 'import' stuff. It has been
moved onto my 'todo' list...
2. Indeed, the function is called twice. Normally CPU time is
no issue; it's two minutes/day at most, and I think most of the
time is spent waiting for the HTML content.
3. If any connection or whatever fails, the script is simply
terminated with a die. No problems from that perspective (i.e.,
a terminated script) thus far.
4. If the content is not matched, the filter just returns an empty
array, and only the 'templates' are logged/mailed. The filter routine
and the parse routine are quite simple, just some coupled regexes
that fill some arrays. If something goes wrong, an array will
be empty, that's all. If a connection fails, the script will
die().
5. See next answer.
Jeroen
I was dreaming of guitarnotes that would irritate an executive kind of guy (FZ)
Another thing I would recommend is setting yourself up with a debugging log. Have it record the last few (or all) of its requests and the results it gets back from the server. The next time your script misbehaves, take a look at this log to see what it was acting upon. You could also put 'markers' or 'checkpoints' in various places in and around your loops, so that information is logged as your program reaches various points of execution; this would also let you trace the flow of execution, though if your script is entering an infinite loop (which I suspect it is), the log file will fill up fast.
Re: Exec'd perl running wild on cpu-time
by ChOas (Curate) on Nov 13, 2000 at 18:44 UTC
Eeehm, dunno if this helps, but seeing that you use 'kill' kinda
makes me guess you use some form of *NIX. Though this is not a real
answer, it might help: putting 'nohup' in front of your
program will let it ignore the HUP signal (which gets sent to your
program when you kill the terminal)...
e.g.: nohup exec get_files.pl files &
Bye!!
Aha! Ambient light fills my spinning head... I would say
this should be THE answer. I killed the script and restarted
it with the nohup. Now I'm heading for the monastery's library...
Cheers, and thanks to you all!
Jeroen
I was dreaming of guitarnotes that would irritate an executive kind of guy (FZ)
Hmmm... replying to my own reply... spells disaster, doesn't it?
Well, according to the library, the effect of nohup could
also be accomplished from within the script by using
$SIG{HUP} = 'IGNORE'; (HUP, the hangup signal, is what closing
the terminal sends). That's at least a more Perlish solution ;-).
May the Perl be at your convenience, and thanks for all the fish.
Or whatever.
Jeroen
Post Posting: I'll keep you _posted_ about the stability of the script. Thanks.
Post Post Posting: Oh yeah, 'file' is Dutch for 'traffic jam', in case you
were wondering about the awkward use of 'file' above.
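To be explicit about the signal involved: it is SIGHUP, not SIGINT, that the shell delivers when the controlling terminal goes away, so the in-script equivalent of nohup is a HUP handler. A self-contained demonstration:

```perl
use strict;
use warnings;

# In-script equivalent of nohup: ignore the hangup signal the shell
# sends when the controlling terminal is closed.
$SIG{HUP} = 'IGNORE';

# Demonstration: deliver a HUP to ourselves; with the line above we survive.
kill 'HUP', $$;
print "survived SIGHUP\n";
```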
Doing something like this seems to have the same effect (it's what I use):
(exec get_files.pl files &)
I wish I knew what the difference was.
Re: Exec'd perl running wild on cpu-time
by elwarren (Priest) on Nov 13, 2000 at 21:09 UTC
If the nohup helped your problem, then I would assume
it's some sort of issue with how your OS handles STDOUT and
STDERR after detaching from the terminal. If you are happy
with the nohup solution (a good solution) then you could
leave it, but if you're still interested in tracking down
your bug, I would redirect STDOUT and STDERR to a log,
as Fastolfe mentioned a couple of posts above. This may also
solve your problem. You could also take a look at the
Net::Daemon or Proc::Daemon modules.
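For reference, the detach sequence those daemon modules perform boils down to a few steps, sketched here by hand with the core POSIX module. The log path is a made-up placeholder; the idea is to call daemonize() once at startup, before entering the main loop.

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Roughly what Proc::Daemon does for you: detach from the terminal,
# so no SIGHUP arrives and no filehandle points at a dead tty.
sub daemonize {
    chdir '/' or die "chdir: $!";
    # Point the standard handles away from the terminal.
    open STDIN,  '<',  '/dev/null'      or die "stdin: $!";
    open STDOUT, '>>', '/tmp/fetch.log' or die "stdout: $!";  # placeholder path
    open STDERR, '>&', \*STDOUT         or die "stderr: $!";
    # Fork, let the parent exit, then start a new session so the
    # child has no controlling terminal at all.
    defined(my $pid = fork) or die "fork: $!";
    exit 0 if $pid;
    setsid() or die "setsid: $!";
}
```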
Wow! More useful tips! Yes, I have actually already
shifted to redirecting with >> log.out 2>&1. This may have
solved the problem as well, as I don't recall on which day
I switched to the redirection. For the curious: the log file
is still empty, but I don't dump any of the data, because that
would clutter up the log with too many bits. I'll certainly look into the
daemon modules.
It feels like 42...
Update: I changed the script at several points.
1. I read the docs on packages and such, and I now use
use strict;.
2. The nohup exec command resulted in an immediate
exit; I think it should be plain nohup.
Anyway, I now use $SIG{HUP} = 'IGNORE'; and
didn't try the nohup/() variants.
3. There is more to starting daemons than I thought. I read
the Net::Daemon and Proc::Daemon PODs, and a complete sequence
is needed for detaching from a terminal. Because I don't need
sockets for my script, I will use Proc::Daemon.
4. However, I'll postpone coding this for a while;
the script has now been running for more than a day
without problems.
Thanks to you all!