http://qs321.pair.com?node_id=631358

isync has asked for the wisdom of the Perl Monks concerning the following question:

I need to query the wisdom of the perl monks...

I've got a script which I switched over to Parallel::ForkManager yesterday, and after a flawless start I am beginning to see where the little troubles are.

I am using a "global" hash to store some data. As long as the for loop ran inside the single script process, everything worked normally. But as soon as I used the ForkManager, each loop iteration became its own entity (as I understand it), and the test for the existence of certain keys always fails. Is that observation correct?

How can I let a (detached/forked) sub access data from the initiating parent process/script? (Maybe it's just a bug I missed and it should work out of the box...) Or is my only option to implement a simple communication system? Any hints for that? Modules, tips?

    my %hash = (test => 1);
    my %otherhash;

    use Parallel::ForkManager;
    my $pm = new Parallel::ForkManager(5);

    for (1..10000) {
        my $pid = $pm->start() and next;

        # some operations involving waiting etc.

        if (exists($hash{test})) {
            # this exist test always fails!
        }

        # write some data to otherhash
        $otherhash{test} = 123;

        $pm->finish();
    }

    # the data 123 never arrives here...:
    print $otherhash{test};
    ...

Replies are listed 'Best First'.
Re: Parallel::ForkManager and vars from the parent script - are they accessible? (or do I need inter process communication?)
by Zaxo (Archbishop) on Aug 08, 2007 at 18:19 UTC

    The child gets a copy of the parent's vars but can never change the parent's copy. That's fundamental unix.

    The parent can read data from a child through some IPC mechanism and make the change itself, but that won't change the copies already-launched children have. Yes, you need IPC.

    SysV IPC has a shared-memory facility which would seem to be just what you want, but SysV IPC is prone to race conditions and is rarely used.

    After Compline,
    Zaxo
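
The copy semantics described above can be seen with nothing but core Perl. The following is a minimal sketch, not the original poster's script: it assumes a Unix-like system with a working fork, and the line format sent over the pipe (`seen=... value=...`) is made up for the illustration. The child reads the parent's data from its own copy, writes to its own copy, and reports back through a pipe so that the parent can apply the change itself.

```perl
use strict;
use warnings;

my %hash = (test => 1);
my %otherhash;

# One-way channel: the child writes, the parent reads.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: every variable here is a private copy of the parent's.
    close $reader;
    my $seen_in_child = exists $hash{test} ? 1 : 0;  # data copied in at fork
    $otherhash{test} = 123;                          # changes the child's copy only
    print {$writer} "seen=$seen_in_child value=$otherhash{test}\n";
    close $writer;
    exit 0;
}

# Parent: read the child's report, then reap the child.
close $writer;
my $line = <$reader>;
close $reader;
waitpid($pid, 0);

# The child's assignment never reached the parent's hash.
my $parent_copy_untouched = exists $otherhash{test} ? 0 : 1;

# The parent makes the change itself, based on what the child sent.
my ($seen, $value) = $line =~ /^seen=(\d+) value=(\d+)$/;
$otherhash{test} = $value;

print "child saw the key: $seen\n";
print "parent copy untouched before merge: $parent_copy_untouched\n";
print "parent value after merge: $otherhash{test}\n";
```

With many children, the same idea needs one pipe per child (or one shared pipe with short, line-sized messages), which is where the IPC modules mentioned in this thread come in.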

Re: Parallel::ForkManager and vars from the parent script - are they accessible? (or do I need inter process communication?)
by clinton (Priest) on Aug 08, 2007 at 18:23 UTC
    Because forks are separate processes (as opposed to threads), they cannot access each other's variables, so you will need some scheme to communicate between them.

    The most appropriate communication scheme depends on how you want to use the data. It may involve opening pipes to communicate between the processes (see IPC::Open2) or access to shared memory space (e.g. IPC::Mmap), or maybe communicating via a file or database, or even signals.

    Note, I haven't used any of these modules, so your mileage may vary. But search CPAN for IPC, and read perlipc for ideas.

    Clint
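
Of the options listed above, communicating via a file is probably the easiest to try with core modules only. The sketch below is one possible arrangement, not a recommendation from the thread: each child serialises its result with Storable into its own file in a temporary directory, and the parent merges the files after all children have exited. The file naming (`$job.sto`) and the squared-number "work" are placeholders.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();
use Storable qw(store retrieve);

my $dir  = tempdir(CLEANUP => 1);   # removed when the parent exits
my @jobs = (1 .. 3);
my @pids;

for my $job (@jobs) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: compute something and write it to a file of its own.
        my %result = (job => $job, square => $job * $job);
        store(\%result, "$dir/$job.sto");
        # Leave without running END blocks, so that the temp-dir
        # cleanup stays with the parent.
        POSIX::_exit(0);
    }
    push @pids, $pid;
}

# Parent: wait for every child before reading any result.
waitpid($_, 0) for @pids;

my %collected;
for my $job (@jobs) {
    my $result = retrieve("$dir/$job.sto");
    $collected{ $result->{job} } = $result->{square};
}

print "$_ => $collected{$_}\n" for sort { $a <=> $b } keys %collected;
```

Because every child owns exactly one file and the parent reads only after `waitpid`, no locking is needed here. A shared file or database that several children write to would need locking on top of this.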

Re: Parallel::ForkManager and vars from the parent script - are they accessible? (or do I need inter process communication?)
by isync (Hermit) on Aug 08, 2007 at 18:32 UTC
    Thanks!

    "The child gets a copy of the parent's vars but can never change the parent's copy."
    explains why the passing of vars *into* the child works, but not the passing of results *out of* the child to parent. --Fundamental, but it wasn't yet clear to me...

    I will now go to the study and look into various IPC solutions....
Re: Parallel::ForkManager and vars from the parent script - are they accessible? (or do I need inter process communication?)
by BrowserUk (Patriarch) on Aug 08, 2007 at 21:30 UTC
      I've got fairly heavy computations which I fork off. And in http://www.perlmonks.org/?node_id=494032 it says that it's better to use real procs instead of threads on heavy things.

      Can somebody explain why!? Should I revert to threads?
        I've got fairly heavy computations which I fork off. And in http://www.perlmonks.org/?node_id=494032 it says that it's better to use real procs instead of threads on heavy things.

        I've got a neighbour who boasted for years that she didn't have a mobile phone: didn't want one; didn't need one; they were just pointless technology, just for kids to waste money on ringtone downloads; another of modern life's annoyances to disturb her in restaurants and theatres.

        Turns out she has been hyperopic for years and refuses to wear glasses. Then someone bought her a mobile with large, clear buttons and she's never without it.

        Perl's threads are imperfect, heavy and require a particular way of working to get the best out of them. But if you need to share data between contexts of execution, they are usually far easier to program and use than the alternatives. They do not lend themselves well to encapsulation, but for many applications that doesn't matter because the infrastructure code required is minimal.

        If you have a brief description of the application, I'd be happy to suggest/post a starting point using threads and you can decide for yourself whether they are suitable for your purposes.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
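
For comparison, sharing a hash between threads needs very little infrastructure. This is a generic sketch using the core `threads` and `threads::shared` modules, not a starting point posted in this thread. It assumes a perl built with ithreads support and simply exits otherwise; the job names and the squared-number "work" are placeholders.

```perl
use strict;
use warnings;
use Config;

BEGIN {
    # Assumption: ithreads are compiled in. Stop politely if they are not.
    unless ($Config{useithreads}) {
        print "this perl has no ithreads support\n";
        exit 0;
    }
}

use threads;
use threads::shared;

# One hash, visible to every thread.
my %otherhash :shared;

my @workers = map {
    my $n = $_;
    threads->create(sub {
        lock(%otherhash);                 # released when the sub returns
        $otherhash{"job$n"} = $n * $n;    # lands in the shared hash
    });
} 1 .. 4;

# Wait for all workers before reading the results.
$_->join for @workers;

print "$_ => $otherhash{$_}\n" for sort keys %otherhash;
```

Only variables marked `:shared` are shared. Everything else is copied into each new thread, much as fork copies it into each child.
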
Re: Parallel::ForkManager and vars from the parent script - are they accessible? (or do I need inter process communication?)
by isync (Hermit) on Aug 08, 2007 at 19:03 UTC
    For the interested: Currently I am test-driving IPC::Shareable to tie persistent variables in shared memory.
      IPC::Shareable is kind of slow, but it may not matter for your data. If you run into trouble with it, MLDBM::Sync is an easy way to share a hash on disk.
        Slow!? I thought it was fast, because everything is held in memory... (the on-disk approach is what I am trying to avoid here...)
Re: Parallel::ForkManager and vars from the parent script - are they accessible? (or do I need inter process communication?)
by isync (Hermit) on Aug 09, 2007 at 11:05 UTC
    Current progress: I put IPC::Shareable aside as I ran into locking and garbage-collection issues, while the script became more and more unmanageable.

    That's why I set up a simple XMLRPC server for the children, which acts as central memory. Looks good.
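
For readers arriving later: releases of Parallel::ForkManager newer than this thread (0.7.6 and later, as far as can be told from its documentation) can hand a data structure from each child back to the parent. The child passes a reference as the second argument to `finish`, and the parent receives it as the last argument of a `run_on_finish` callback. The sketch below reworks the code from the question under that assumption. It needs the CPAN module installed, and the job numbering and the doubled values are placeholders.

```perl
use strict;
use warnings;
use Parallel::ForkManager;   # assumes a release with data-structure retrieval

my %hash = (test => 1);
my %otherhash;

my $pm = Parallel::ForkManager->new(5);

# Runs in the parent each time a child has been reaped.
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data) = @_;
    # $data is whatever reference the child passed to finish(), if any.
    $otherhash{ $data->{job} } = $data->{value} if defined $data;
});

for my $job (1 .. 10) {
    $pm->start and next;     # parent continues the loop, child falls through

    # Child: the parent's %hash is readable here as a copy.
    my $value = exists $hash{test} ? $job * 2 : 0;

    # Exit code 0, plus a reference to send back to the parent.
    $pm->finish(0, { job => $job, value => $value });
}

# Results are only complete once every child has been reaped.
$pm->wait_all_children;

print "$_ => $otherhash{$_}\n" for sort { $a <=> $b } keys %otherhash;
```

The data travels one way, once, when the child exits. A child that must see updates made by its siblings while still running needs one of the other schemes discussed above.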