
Re: Module for transparently forking a sub?

by BrowserUk (Pope)
on Feb 13, 2009 at 17:11 UTC (#743647)


in reply to Module for transparently forking a sub?

My questions are does this already exist?

Yes! It's called threads::async(). Could it be any easier?

#! perl -slw
use strict;
use threads;
use Data::Dumper;

## "fork" the subroutine
my( $thread ) = async {
    my %hash = (
        A => [ 1 .. 10 ],
        B => { 'a' .. 'z' },
        C => 'Just a big scalar' x 100,
    );
    return \%hash;
};

## Do other stuff
sleep 10;

## Get the complex results
my( $complexData ) = $thread->join;

## Display them
print Dumper $complexData;

__END__
c:\test>junk8
$VAR1 = {
          'A' => [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ],
          'C' => 'Just a big scalarJust a big scalarJust a ... alarJust a big scalarJust a big scalarJust a big scalarJus ...
          'B' => {
                   'w' => 'x',
                   'e' => 'f',
                   'a' => 'b',
                   'm' => 'n',
                   's' => 't',
                   'y' => 'z',
                   'u' => 'v',
                   'c' => 'd',
                   'k' => 'l',
                   'q' => 'r',
                   'g' => 'h',
                   'i' => 'j',
                   'o' => 'p'
                 }
        };

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Module for transparently forking a sub?
by samtregar (Abbot) on Feb 13, 2009 at 20:50 UTC
    Heh. One way it could be easier is if there was a reliable way to know if a given Perl module is thread-safe! Most pure Perl code will be but XS modules often aren't unless someone has gone to the trouble of making them that way.

    The same can be said of forking, of course: you could say that DBD::mysql isn't fork-safe and you'd be kind of right. But there are definitely fewer problems with forking and XS code.

    Also, the performance of threads, particularly for smallish tasks, is really quite bad. I know you're going to ask me to quantify that statement but I really don't have the time. I've seen it benchmarked plenty of times before though, so you can probably find a fork versus threads benchmark around.

    -sam

      So, you have time to make the claim, but not the time to substantiate it. There's a name for that: FUD!

      Okay, here's my counterclaim.

      I can start a thread, run a subroutine that returns a complex data structure, and retrieve that data structure to the calling code faster than you can do the same using fork. My timing is: 0.0261 seconds.

      c:\test>junk8 -N=100
      Time taken: 0.0261 seconds
      {
        A => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        ARGS => [1, "2.3", "four"],
        B => {
          a => "b",
          c => "d",
          e => "f",
          g => "h",
          i => "j",
          k => "l",
          "m" => "n",
          o => "p",
          "q" => "r",
          "s" => "t",
          u => "v",
          w => "x",
          "y" => "z",
        },
        C => "Just a big scalarJust a big scala big scalarJust a big scalarJust a big sc
      }

      And my benchmark code:

      #! perl -slw
      use strict;
      use threads;
      use Time::HiRes qw[ time ];
      use Data::Dump qw[ pp ];

      our $N ||= 10;

      sub stuff {
          my %hash = (
              ARGS => \@_,
              A => [ 1 .. 10 ],
              B => { 'a' .. 'z' },
              C => 'Just a big scalar' x 100,
          );
          return \%hash;
      }

      my $complexData;
      my $start = time;
      for ( 1 .. $N ) {
          ## "fork" the subroutine
          my( $thread ) = async \&stuff, 1, 2.3, 'four' ;

          ## Do other stuff
          sleep 1;

          ## Get the complex results
          $complexData = $thread->join;
      }
      printf "Time taken: %.4f seconds\n", ( time() - $start ) / $N - 1;

      ## Display them
      pp $complexData;

      Care to substantiate your claim and disprove mine?


        Ok, I took the bait, and my faith in the general slowness of threading remains intact. Here are my results:

        $ perl thread.pl -N=100
        Time taken w/ threads: 0.0077 seconds
        Time taken w/ forks: 0.0021 seconds

        And here's my code:

        #! perl -slw
        use strict;
        use threads;
        use Time::HiRes qw[ time ];
        use Data::Dump qw[ dump ];
        use Storable qw[ store_fd fd_retrieve ];

        our $N ||= 10;

        sub stuff {
            my %hash = (ARGS => \@_,
                        A => [1 .. 10],
                        B => {'a' .. 'z'},
                        C => 'Just a big scalar' x 100,);
            return \%hash;
        }

        my $complexData;
        {
            my $start = time;
            for (1 .. $N) {
                ## "fork" the subroutine
                my ($thread) = async \&stuff, 1, 2.3, 'four';

                ## Do other stuff
                sleep 1;

                ## Get the complex results
                $complexData = $thread->join;
            }
            printf "Time taken w/ threads: %.4f seconds\n", (time() - $start) / $N - 1;
        }

        my $complexData2;
        {
            my $start = time;
            pipe(READ, WRITE);
            for (1 .. $N) {
                my $pid = fork;
                if (!$pid) {
                    # in the kid - do stuff() and send it back to parent
                    store_fd(stuff(1, 2.3, 'four'), \*WRITE);
                    exit;
                }

                ## Do other stuff
                sleep 1;

                ## Get the complex results
                $complexData2 = fd_retrieve(\*READ);
                waitpid($pid,0);
            }
            printf "Time taken w/ forks: %.4f seconds\n", (time() - $start) / $N - 1;
        }

        # data should match
        if (dump($complexData) ne dump($complexData2)) {
            warn "Data did not match!";
        }

        As I was coding it I realized it's kind of a bizarre benchmark, since it's not really testing any concurrency. It's only testing how fast a single thread/process can be spawned and send back data. And really there's just no way Perl's threads are going to beat fork() at that test!
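For comparison, a variant that did exercise concurrency would start every worker before collecting any result. The following is only a rough sketch under assumptions: it uses fork on a Unix-like system, and `work()` and `run_concurrently()` are hypothetical names that appear nowhere in the benchmark above.

```perl
use strict;
use warnings;
use POSIX ();
use Storable qw(store_fd fd_retrieve);
use Time::HiRes qw(time sleep);

# Hypothetical stand-in for real work: pause briefly, then return a structure.
sub work {
    my ($id) = @_;
    sleep 0.2;
    return { id => $id, squares => [ map { $_ * $_ } 1 .. $id ] };
}

# Start every child first, then collect, so the children overlap in time.
sub run_concurrently {
    my (@ids) = @_;
    my @kids;
    for my $id (@ids) {
        pipe( my $reader, my $writer ) or die "pipe failed: $!";
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: serialise the result into its own pipe and leave.
            close $reader;
            store_fd( work($id), $writer );
            close $writer;
            POSIX::_exit(0);    # skip END blocks and inherited buffers
        }
        close $writer;
        push @kids, { pid => $pid, reader => $reader };
    }

    my @results;
    for my $kid (@kids) {
        push @results, fd_retrieve( $kid->{reader} );
        close $kid->{reader};
        waitpid( $kid->{pid}, 0 );
    }
    return @results;
}

my $start   = time;
my @results = run_concurrently( 1 .. 4 );
printf "%d results in %.2f seconds\n", scalar @results, time() - $start;
```

Each child gets its own pipe here, so results cannot interleave; the single shared pipe in the benchmark above is safe only because one child runs at a time.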

        One neat thing I learned - I didn't realize you could use Storable's store_fd() and fd_retrieve() to pass messages like this. I'd previously used nstore() and thaw() with a prefixed length() of the message so the other end would know how much to read. This is so much easier!
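For readers who have not seen the length-prefix approach, here is a minimal sketch of the idea. It assumes Storable's in-memory nfreeze() paired with thaw(), and a 4-byte big-endian length header; `send_msg`, `recv_msg` and `read_exact` are hypothetical helper names, not part of Storable.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Write one message: a 4-byte big-endian length, then the frozen bytes.
sub send_msg {
    my ( $fh, $data ) = @_;
    my $frozen = nfreeze($data);
    print {$fh} pack( 'N', length $frozen ), $frozen
        or die "write failed: $!";
    return;
}

# Read exactly $want bytes, looping because read() may return fewer.
sub read_exact {
    my ( $fh, $want ) = @_;
    my $buf = '';
    while ( length($buf) < $want ) {
        my $got = read( $fh, $buf, $want - length($buf), length($buf) );
        die "read failed: $!" unless defined $got;
        die "unexpected EOF"  if $got == 0;
    }
    return $buf;
}

# Read one message: the length header first, then that many bytes.
sub recv_msg {
    my ($fh) = @_;
    my $len = unpack 'N', read_exact( $fh, 4 );
    return thaw( read_exact( $fh, $len ) );
}

# Demo within one process; the message is small enough to sit in the pipe buffer.
pipe( my $reader, my $writer ) or die "pipe failed: $!";
binmode $_ for $reader, $writer;
send_msg( $writer, { name => 'example', list => [ 1, 2, 3 ] } );
close $writer;
my $msg = recv_msg($reader);
print "got list of ", scalar @{ $msg->{list} }, " items\n";
```

store_fd() and fd_retrieve() make the header and the read loop unnecessary, which is presumably why they feel so much easier.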

        -sam

        PS: I just noticed you're on Windows (or DOS, I guess)! You don't have a real fork() there, so I guess you're not going to be able to replicate my results. Oh well.

        Oh fine. But you'll have to wait. It does sound fun, but real work beckons...

        -sam

        Care to substantiate your claim and disprove mine?

        Sure, you cannot fork on Windows.
Re^2: Module for transparently forking a sub?
by kyle (Abbot) on Feb 13, 2009 at 21:51 UTC

    That is easy! Based on some of the other comments here, I think there might still be a use for a fork-based solution. If I wind up writing one, maybe I could steal the interface from threads for it.

      If I wind up writing one, maybe I could steal the interface from threads for it.

      That already exists, but it is just as vulnerable to thread-safety problems as ithreads; e.g., you probably won't be able to run concurrent DB queries safely, and if you're lucky enough to find thread-safe drivers, concurrent queries will likely not run any quicker than they do serially. And it is not a cross-platform solution; threads is!

      And for passing back complex data structures, serialising them through a pipe is far slower than using shared memory.



        I don't think I understand your reply, so I'm wondering if I wasn't clear.

        What I meant was that if I write a module like the one I was looking for (which does with fork what you demonstrate with threads), I could use methods with the same names and calling conventions as threads (e.g., Foo->create( { context => 'list' }, \&foo ), $child->is_running(), etc.). The interface already defined for it looks nicer than what I had in mind before I read about it. Using it would also mean that someone who becomes familiar with one will have an easier time switching over to using the other one if circumstances warrant it.
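To make the idea concrete, here is a rough sketch of how such a module might look. Everything in it is an assumption for illustration: `ForkedSub` is a made-up package name rather than an existing module, the interface is cut down to create(), is_running() and join() without the context hash, and it relies on a real fork(), so a Unix-like system is assumed.

```perl
package ForkedSub;    # hypothetical name, not an existing CPAN module
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Storable qw(store_fd fd_retrieve);

# Fork a child that runs $code->(@args) and sends the result back over a pipe.
sub create {
    my ( $class, $code, @args ) = @_;
    pipe( my $reader, my $writer ) or die "pipe failed: $!";
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        close $reader;
        # Wrap the return list in an array ref so any return value can be stored.
        store_fd( [ $code->(@args) ], $writer );
        close $writer;
        POSIX::_exit(0);    # skip END blocks and inherited output buffers
    }
    close $writer;
    return bless { pid => $pid, reader => $reader, reaped => 0 }, $class;
}

# True while the child process has not yet exited.
sub is_running {
    my ($self) = @_;
    return 0 if $self->{reaped};
    return 1 if waitpid( $self->{pid}, WNOHANG ) == 0;
    $self->{reaped} = 1;
    return 0;
}

# Block until the result arrives, reap the child, and return the result.
sub join {
    my ($self) = @_;
    my $result = fd_retrieve( $self->{reader} );
    close $self->{reader};
    waitpid( $self->{pid}, 0 ) unless $self->{reaped};
    $self->{reaped} = 1;
    die "child returned no data" unless $result;
    return wantarray ? @$result : $result->[-1];
}

package main;

my $child = ForkedSub->create( sub { return { sum => $_[0] + $_[1] } }, 2, 3 );
my $data  = $child->join;
print "sum = $data->{sum}\n";
```

Because the method names mirror those of threads, code written against one could be switched to the other with few changes, which is the appeal described above. The child's code always runs in list context in this sketch; a fuller version would honour the context option.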

      You might want to take a look at IPC::Run. It might be just what you're looking for.
