Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re^2: Module for transparently forking a sub?

by samtregar (Abbot)
on Feb 13, 2009 at 20:50 UTC ( #743698=note: print w/replies, xml ) Need Help??


in reply to Re: Module for transparently forking a sub?
in thread Module for transparently forking a sub?

Heh. One way it could be easier is if there was a reliable way to know if a given Perl module is thread-safe! Most pure Perl code will be but XS modules often aren't unless someone has gone to the trouble of making them that way.

The same can be said of forking of course, you could say that DBD::mysql isn't fork-safe and you'd be kind of right. But there's definitely fewer problems with forking and XS code.

Also, the performance of threads, particularly for smallish tasks, is really quite bad. I know you're going to ask me to quantify that statement but I really don't have the time. I've seen it benchmarked plenty of times before though, so you can probably find a fork versus threads benchmark around.

-sam

  • Comment on Re^2: Module for transparently forking a sub?

Replies are listed 'Best First'.
Re^3: Module for transparently forking a sub?
by BrowserUk (Pope) on Feb 13, 2009 at 22:01 UTC

    So, you have time to make the claim, but not the time to substantiate it. There's a name for that:FUD!

    Okay, here my counter claim.

    I can start a thread, run a subroutine that returns a complex data structure, and retrieve that data structure to the calling code faster than you can do the same using fork. My timing is: 0.0261 seconds.

    c:\test>junk8 -N=100 Time taken: 0.0261 seconds { A => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], ARGS => [1, "2.3", "four"], B => { a => "b", c => "d", e => "f", g => "h", i => "j", k => "l", "m" => "n", o => "p", "q" => "r", "s" => "t", u => "v", w => "x", "y" => "z", }, C => "Just a big scalarJust a big scala big scalarJust a big scalarJust a big sc }

    And my benchmark code:

    #! perl -slw use strict; use threads; use Time::HiRes qw[ time ]; use Data::Dump qw[ pp ]; our $N ||= 10; sub stuff { my %hash = ( ARGS => \@_, A => [ 1 .. 10 ], B => { 'a' .. 'z' }, C => 'Just a big scalar' x 100, ); return \%hash; } my $complexData; my $start = time; for ( 1 .. $N ) { ## "fork" the subroutine my( $thread ) = async \&stuff, 1, 2.3, 'four' ; ## Do other stuff sleep 1; ## Get the complex results $complexData = $thread->join; } printf "Time taken: %.4f seconds\n", ( time() - $start ) / $N - 1; ## Display them pp $complexData;

    Care to substantiate your claim and disprove mine?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Ok, I took the bait and my faith in the general slowness of threading remains intact. Here's my results:

      $ perl thread.pl -N=100 Time taken w/ threads: 0.0077 seconds Time taken w/ forks: 0.0021 seconds

      And here's my code:

      #! perl -slw use strict; use threads; use Time::HiRes qw[ time ]; use Data::Dump qw[ dump ]; use Storable qw[ store_fd fd_retrieve ]; our $N ||= 10; sub stuff { my %hash = (ARGS => \@_, A => [1 .. 10], B => {'a' .. 'z'}, C => 'Just a big scalar' x 100,); return \%hash; } my $complexData; { my $start = time; for (1 .. $N) { ## "fork" the subroutine my ($thread) = async \&stuff, 1, 2.3, 'four'; ## Do other stuff sleep 1; ## Get the complex results $complexData = $thread->join; } printf "Time taken w/ threads: %.4f seconds\n", (time() - $start) +/ $N - 1; } my $complexData2; { my $start = time; pipe(READ, WRITE); for (1 .. $N) { my $pid = fork; if (!$pid) { # in the kid - do stuff() and send it back to parent store_fd(stuff(1, 2.3, 'four'), \*WRITE); exit; } ## Do other stuff sleep 1; ## Get the complex results $complexData2 = fd_retrieve(\*READ); waitpid($pid,0); } printf "Time taken w/ forks: %.4f seconds\n", (time() - $start) / +$N - 1; } # data should match if (dump($complexData) ne dump($complexData2)) { warn "Data did not match!"; }

      As I was coding it I realized it's kind of a bizarre benchmark since it's not really testing any concurency. It's only testing how fast a single thread/process can be spawned and send back data. And really there's just no way Perl's threads are going to beat fork() at that test!

      One neat thing I learned - I didn't realize you could use use Storable's store_fd() and fd_retrieve() to pass messages like this. I'd previously used nstore() and thaw() with a prefixed length() of the message so the other end would know how much to read. This is so much easier!

      -sam

      PS: I just noticed you're on Windows (or DOS, I guess)! You don't have a real fork() there, so I guess you're not going to be able to replicate my results. Oh well.

      Oh fine. But you'll have to wait. It does sound fun, but real work beckons...

      -sam

      --Care to substantiate your claim and disprove mine? Sure, you cannot fork on windows.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://743698]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (6)
As of 2020-07-02 17:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?