Threaded script eating memory

by suaveant (Parson)
on Jun 22, 2009 at 19:57 UTC ([id://773751])

suaveant has asked for the wisdom of the Perl Monks concerning the following question:

Hey all, I seem to have a memory leak in my threaded script. I haven't done much thread work, so it is probably an oversight on my part; I'd appreciate a look.

This is a simplified version that exhibits the problem. I am trying the classic method of one worker filling a queue while multiple workers read from the queue and manage the data. My total dataset is over a million records, so I am breaking it into 10000-record chunks, but it seems the chunks are hanging around in memory. This example just generates data forever, but watching top clearly shows that the data is sticking around.

I am using Perl 5.8.5, threads 1.73, threads::shared 1.29 and Thread::Queue 2.11, all on Linux.

    use threads;
    use threads::shared qw(share);
    use Thread::Queue;
    use Thread::Semaphore;

    my $q : shared       = Thread::Queue->new();
    my $loading : shared = 1;
    my $threads : shared = 0;

    my $thr = threads->create( \&load_queue, $q, \$loading );
    $thr->detach();
    $|++;

    my @threads;
    for ( 1 .. 6 ) {
        my $thr = threads->create( \&{type_issues}, $q, \$loading, $threads );
        push @threads, $thr;
    }
    $_->join() for @threads;

    sub load_queue {
        my $q       = shift;
        my $loading = shift;
        my $i       = 1;
        my $ary     = &share( [] );
        while ( my $v = ["this is a test" x 250] ) {
            # print "$v->[0]\n";
            push @$ary, $v->[0];
            unless ( $i++ % 10000 ) {
                print "$i\n";
                $q->enqueue($ary);
                $ary = &share( [] );
                while ( $q->pending() > 20 ) {    # try not to eat tooo much memory
                    select(undef,undef,undef,.1);
                }
            }
        }
        $q->enqueue($ary) if @$ary;
        $$loading = 0;
    }

    sub type_issues {
        my ( $q, $loading ) = @_;
        #my $DB = IDC::Data->new( bes => 'webdev' );
        print "Type issues\n";
        my $issues;
        while ( ( $issues = $q->dequeue() ) ) {
            print "Got ".@$issues." - ".$q->pending()."\n";
            select(undef,undef,undef,.1);
        }
    }

                - Ant
                - Some of my best work - (1 2 3)

Replies are listed 'Best First'.
Re: Threaded script eating memory
by BrowserUk (Patriarch) on Jun 22, 2009 at 20:51 UTC
    I am using Perl 5.8.5,

    Try upgrading your perl.

    I see no memory leaks at all on my system under both:

    • Perl v5.8.9(32-bit) threads v1.71 threads::shared v1.27 Thread::Queue v2.11

      Memory usage goes directly to 262.6MB and stays exactly there.

    • Perl v5.10(64-bit) threads v1.71 threads::shared v1.26 Thread::Queue v2.11

      Memory usage goes directly to 262.4MB and stays exactly there.

    Update: Probably irrelevant as this is only demo code, but you are doing some mighty peculiar things.

    1. There is no need to share the queue:
      my $q : shared       = Thread::Queue->new();
      Thread::Queue objects are created shared.

    2. Why are you creating your big strings inside 1-element anonymous arrays, only to pluck each one out and push it onto your shared array?
      while ( my $v = ["this is a test" x 250] ) {
      push @$ary, $v->[0];

    There is weirdness with your whole architecture, but it's pointless debugging a demo app. If you can post the real code, or a sanitised but still representative example, then I would take a look at that.
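
    A minimal sketch of the two points above, assuming a current threads-enabled perl rather than 5.8.5. The constants WORKERS, CHUNK_SIZE and CHUNKS, the sub names produce and consume, and the undef end-of-data marker are illustrative choices, not taken from the original script. The queue is a plain lexical, and each string is pushed straight onto the shared chunk:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

# Hypothetical sizes, chosen only to keep the demo small.
use constant {
    WORKERS    => 3,
    CHUNK_SIZE => 100,
    CHUNKS     => 5,
};

# Consumer: dequeue chunks until an undef end-of-data marker arrives,
# then return the number of rows seen.
sub consume {
    my ($q) = @_;
    my $rows = 0;
    while ( defined( my $chunk = $q->dequeue() ) ) {
        $rows += @$chunk;
    }
    return $rows;
}

# Producer: build each chunk in a shared array and push the strings
# directly, without wrapping them in a temporary anonymous array.
sub produce {
    my ( $q, $workers ) = @_;
    for ( 1 .. CHUNKS ) {
        my $chunk = &share( [] );
        push @$chunk, "this is a test" x 250 for 1 .. CHUNK_SIZE;
        $q->enqueue($chunk);
    }
    $q->enqueue(undef) for 1 .. $workers;    # one marker per consumer
}

# No ": shared" attribute here: the queue is already safe to pass to threads.
my $q = Thread::Queue->new();

my @consumers = map { threads->create( \&consume, $q ) } 1 .. WORKERS;
threads->create( \&produce, $q, WORKERS )->join();

my $total = 0;
for my $thr (@consumers) {
    my ($rows) = $thr->join();    # list context, matching thread creation
    $total += $rows;
}
print "Consumed $total rows\n";
```

    Because every consumer receives its own undef marker, the join calls return once the data runs out instead of blocking forever in dequeue.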


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Not the simplest of options: I'm on a shared system managed by other people. I can try on another system of mine, and maybe I can convince people to upgrade, or maybe I'll just have to rewrite it in a less clever fashion.

                      - Ant
                      - Some of my best work - (1 2 3)

        There really is no better way to work around fixed core bugs than to upgrade.

        However, if memory serves, the main problem with threads in 5.8.5 was leaking closures. It is possible that by rearranging your code (physically within the source file) to avoid unnecessary closures, you might be able to avoid the leaks you are seeing.

        The first step is to move all your subroutines to the top of the source file, after the use lines but before you declare any package-level variables. E.g., for your demo code, you might do something like this:

            use threads;
            use threads::shared qw(share);
            use Thread::Queue;
            use Thread::Semaphore;

            sub load_queue {
                my $q       = shift;
                my $loading = shift;
                my $i       = 1;
                my $ary     = &share( [] );
                while ( my $v = ["this is a test" x 250] ) {
                    # print "$v->[0]\n";
                    push @$ary, $v->[0];
                    unless ( $i++ % 10000 ) {
                        print "$i\n";
                        $q->enqueue($ary);
                        $ary = &share( [] );
                        while ( $q->pending() > 20 ) {    # try not to eat tooo much memory
                            select(undef,undef,undef,.1);
                        }
                    }
                }
                $q->enqueue($ary) if @$ary;
                $$loading = 0;
            }

            sub type_issues {
                my ( $q, $loading ) = @_;
                #my $DB = IDC::Data->new( bes => 'webdev' );
                print "Type issues\n";
                my $issues;
                while ( ( $issues = $q->dequeue() ) ) {
                    print "Got ".@$issues." - ".$q->pending()."\n";
                    select(undef,undef,undef,.1);
                }
            }

            my $q = Thread::Queue->new();
            my $loading : shared = 1;
            my $threads : shared = 0;

            my $thr = threads->create( \&load_queue, $q, \$loading );
            $thr->detach();
            $|++;

            my @threads;
            for ( 1 .. 6 ) {
                my $thr = threads->create( \&{type_issues}, $q, \$loading, $threads );
                push @threads, $thr;
            }
            $_->join() for @threads;

        Note: That makes no attempt to address any of the other issues I mentioned in the update to my first reply above.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        Different versions of Perl can happily co-habit one machine.
      The anonymous array and the $v->[0] are just remnants of the real code which did a fetchrow_arrayref.

      I didn't realize the queues were shared, though it makes sense.

                      - Ant
                      - Some of my best work - (1 2 3)
