need help debugging perl script killed by SIGKILL

by expo1967 (Sexton)
on Mar 01, 2021 at 19:57 UTC

expo1967 has asked for the wisdom of the Perl Monks concerning the following question:

At the office I am working on a threaded Perl script on a Linux system. At first, just as a test, I had the script work on a subset of a large data file.

I gradually increased the amount of data to be processed and everything seemed fine, but I have now reached a limit of some kind: my script dies with a "Killed" message and an exit status of 137 (which means it was killed by a SIGKILL signal).

The main script creates a thread queue and loads it with all of the data records from the large data file, then starts all the threads (currently 35). Each thread first sets an element in a shared hash to indicate that it has started, next loops over elements from the queue using a non-blocking fetch until nothing is returned, and finally sets the shared hash to indicate that it is done.

My script detaches the threads after they are started.

After all of the threads are started, the main script goes into a loop waiting for all of the threads to mark their status flags as DONE. If, after a certain period of time, not all of the flags indicate DONE, the main script prints a message and exits.
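
In outline, the structure is something like the sketch below (simplified, not my actual code; read_records() and process() stand in for the real file loading and record handling, and the timeout value is just illustrative):

    use strict;
    use warnings;
    use threads;
    use threads::shared;
    use Thread::Queue;

    sub read_records { map { "record $_" } 1 .. 1_000 }   # stand-in for the real loader
    sub process      { my ($rec) = @_; }                  # stand-in for the real work

    my $queue = Thread::Queue->new();
    $queue->enqueue($_) for read_records();

    my %status : shared;
    my $nthreads = 35;
    for my $i (1 .. $nthreads) {
        threads->create(sub {
            $status{$i} = 'STARTED';
            while (defined(my $rec = $queue->dequeue_nb())) {
                process($rec);
            }
            $status{$i} = 'DONE';
        })->detach();
    }

    # Wait, with a timeout, for every thread to report DONE.
    my $deadline = time() + 600;    # timeout is illustrative
    while (time() < $deadline) {
        last if $nthreads == grep { ($status{$_} // '') eq 'DONE' } 1 .. $nthreads;
        sleep 1;
    }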

Both the main script and the threads use signal handling. No signals were caught.

I am guessing that my script was killed because it used too much memory. Does a Perl thread queue store its elements in memory?

Any suggestions on what to check for memory issues, or on any other likely suspects?


Replies are listed 'Best First'.
Re: need help debugging perl script killed by SIGKILL
by Fletch (Bishop) on Mar 01, 2021 at 20:28 UTC

    A SIGKILL can't be caught by design so whatever signal handling's in play wouldn't enter into the picture. You might check the output from dmesg (and/or ask your sysadmin to check in /var/log/messages) and see if there's anything untoward there; if you're running the box out of free memory the kernel's OOM killer may have left notes there if it knocked your process out.
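
    Something along these lines (a quick sketch, assuming a Linux box where dmesg is readable by your user) would surface any OOM killer entries:

        use strict;
        use warnings;

        # Scan the kernel ring buffer for signs of the OOM killer.
        open my $dmesg, '-|', 'dmesg' or die "Cannot run dmesg: $!";
        while (my $line = <$dmesg>) {
            print $line if $line =~ /out of memory|oom[-_ ]?killer|killed process/i;
        }
        close $dmesg;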

    The cake is a lie.

Re: need help debugging perl script killed by SIGKILL
by bliako (Monsignor) on Mar 01, 2021 at 21:30 UTC

    In general, when a new thread starts it gets a private copy of each existing variable (see threads::shared). On OSes that utilise copy-on-write (e.g. Linux) the effect of this may not be immediate, although Perl does try to make private copies. It's good practice, before starting a thread (or 35 of them), to destroy all unwanted data in memory.
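
    To see those copy semantics in action (a minimal sketch): a plain variable is copied into the new thread, while a :shared one is not:

        use strict;
        use warnings;
        use threads;
        use threads::shared;

        my $private         = 'original';
        my $shared : shared = 'original';

        threads->create(sub {
            $private = 'changed in thread';   # changes the thread's private copy only
            $shared  = 'changed in thread';   # visible to all threads
        })->join();

        print "private: $private\n";   # prints "original"
        print "shared:  $shared\n";    # prints "changed in thread"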

    You mention you are using Thread::Queue. Its doc states in https://perldoc.perl.org/Thread::Queue#DESCRIPTION that

    Ordinary scalars are added to queues as they are. If not already thread-shared, the other complex data types will be cloned (recursively, if needed, and including any blessings and read-only settings) into thread-shared structures before being placed onto a queue.

    As I understand it, the data going into the queue should be declared :shared; otherwise it will be cloned into a thread-shared copy when enqueued, and the original, non-shared data will still be duplicated into each of the 35 threads.

    bw, bliako

      Thanks for the reply. I modified my $queue->enqueue() operation to use a shared variable, but the same problem still occurs.

      You stated "your data going into the queue must be declared :shared"; how do I do that? I have been searching Google and have not found anything on how to accomplish this for enqueue operations.

        The link I posted peripherally shows how to enqueue a blessed hash (object). And threads::shared has an example of how to create a shared hash which contains other shared items:

            use threads;
            use threads::shared;
            use Thread::Queue;

            my %hash;   share(%hash);
            my $scalar; share($scalar);
            my @arr;    share(@arr);
            # or
            # my (%hash, $scalar, @arr) : shared;

            $scalar = "abc";
            $hash{'one'} = $scalar;
            $hash{'two'} = 'xyz';
            $arr[0] = 1;
            $arr[1] = 2;
            $hash{'three'} = \@arr;

            my $q = Thread::Queue->new();   # A new empty queue
            $q->enqueue(\%hash);

        The "pitfall" I had in mind is this:

            my $hugedata = <BIGSLURP>;
            my (%hash) : shared;
            %hash = process($hugedata);   # perhaps filtering or rearranging it into a hash
            # $hugedata = undef;          # <<< if done with it, then unload it, otherwise ...
            threads->create(...);         # ... hugedata is duplicated, %hash is not.

        Memory is not the only reason the kernel can kill your process; perhaps "too many" threads will have the same effect. So you should also find the exact messages in /var/log/messages and the output of dmesg, as Fletch suggested. Additionally, you must measure memory usage exactly (as opposed to just observing a SIGKILL and assuming memory was the cause). If you are on some sort of *nix, that's easy.
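
        For example, on Linux you can read your process's resident set size straight out of /proc (a minimal sketch; call it periodically from the main loop):

            use strict;
            use warnings;

            # Report this process's resident set size (Linux-specific).
            sub current_rss_kb {
                open my $fh, '<', "/proc/$$/status"
                    or die "Cannot read /proc/$$/status: $!";
                while (my $line = <$fh>) {
                    return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
                }
                return;
            }

            printf "RSS: %d kB\n", current_rss_kb();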

        bw, bliako
