
Code and Process Efficiency

by mcogan1966 (Monk)
on Dec 29, 2003 at 20:13 UTC ( #317529=perlquestion )

mcogan1966 has asked for the wisdom of the Perl Monks concerning the following question:

Take the following section of code:
    $pid = open($firsh_child, "-|");
    if ($pid) {
        while (<$first_child>) {
            $lines .= $_;
        }
    }
    else {
        foreach (@gets) {
            $i = 0;
            $cpid[$i] = fork;
            unless ($cpid[$i]) {
                print `perl $_`;
                exit;
            }
            $i++;
        }
        exit;
    }
@gets contains a list of outside perl programs that are to be called, where the response is printed back to the main program. Everything works as it should. The forked processes merrily create their children and run concurrently. But I'm running into a performance issue.

If I run one program, running time is usually 1-3 seconds. However, if I pass a number of programs, the speed of each program drops dramatically, frequently double or more the time to run each program. So, instead of having a process take 1-3 seconds, it'll take 2-6, sometimes more. In fact, once the number of called programs exceeds 4, the programs called last tend to take 6-8 seconds to run.

I've considered the fact that this may be a memory related issue, but I don't know how to check the memory usage while this is running (UNIX environment). I'd prefer to have something that can print to STDERR while running so that I can check values while the main program runs.
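One way to watch memory from inside the running program, printing to STDERR as requested, is to shell out to ps(1). A minimal sketch, assuming a UNIX ps that accepts -o rss= (true on Linux, the BSDs, and Solaris); the allocation is only there to make the numbers move:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Print this process's resident set size (in KB) to STDERR.
# Assumes a ps(1) that supports `-o rss=`.
sub report_mem {
    my ($label) = @_;
    chomp(my $rss = `ps -o rss= -p $$`);
    $rss =~ s/^\s+//;
    print STDERR "$label: RSS ${rss} KB\n";
}

report_mem('before');
my @big = (0) x 1_000_000;    # allocate something measurable
report_mem('after');
```

Calling report_mem() before the fork loop and again inside each child would show whether the slowdown tracks memory growth.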

Anyone out there with any code tuning experience that can give me some pointers here?

Replies are listed 'Best First'.
Re: Code and Process Efficiency
by Roger (Parson) on Dec 29, 2003 at 20:24 UTC
    On a UNIX system, you could use the top utility to inspect the memory usage, if the administrator has installed it of course.

    I suspect that your machine has a single processor, which means that all running processes on the machine share the same CPU. Unless your system has multiple CPUs, forking is not going to speed up processing of your programs, but rather make them slower because of memory usage and task-switching overheads.

      Unless your system has multiple CPUs, forking is not going to speed up processing of your programs, but rather make them slower because of memory usage and task-switching overheads.
      Assuming each process is CPU-limited.

      Having more processes doing the work gives you a bigger aggregate share of the timeslices. Chances are that CPU loading is not the problem; I/O latency is a more likely culprit, whether that comes from explicit I/O calls or from VM swapping.
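      One quick way to tell the two apart (a sketch, not from the thread) is to compare a child's wall-clock time against the CPU time charged to it; if wall time far exceeds CPU time, the child was waiting on I/O or swap rather than competing for the processor. Here `sleep 1` is a stand-in for a real report script:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Compare wall-clock time with the CPU time charged to a child command.
my $cmd = 'sleep 1';    # stand-in for one of the report programs

my $t0 = [gettimeofday];
my (undef, undef, $cu0, $cs0) = times;   # child CPU accumulated so far
system($cmd) == 0 or warn "command failed: $?";
my (undef, undef, $cu1, $cs1) = times;

my $wall = tv_interval($t0);
my $cpu  = ($cu1 - $cu0) + ($cs1 - $cs0);
printf STDERR "wall %.2fs, child CPU %.2fs\n", $wall, $cpu;
```

      For sleep, wall time is about a second while child CPU time is near zero: the signature of a process that waits rather than computes.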

      After Compline,

Re: Code and Process Efficiency
by Zaxo (Archbishop) on Dec 29, 2003 at 20:51 UTC

    I think you've cut this a little too deep for publication. There is a disagreement between open and the diamond about the name of the filehandle ($firsh_child vs. $first_child). Furthermore, open is missing its third argument. Your example dies with a syntax error. Aside from that, your code could be paraphrased as

    {
        local $/;
        open my $first_child, '-|', $cmd
            and $lines = <$first_child>
            or do {
                # forky stuff
            };
    }
    If you get multiple children, it is because the open call fails.

    Check whether each child uses huge hashes or arrays. Memory pressure is the likeliest cause of noticeable slowdowns. Often, the cure is to rewrite the input handling.
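    As an illustration of rewriting the input handling, processing the pipe a line at a time keeps the parent's memory flat instead of letting $lines grow with the child's output. A sketch, where `echo hello` stands in for the real `perl $_` command:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the child's output a line at a time rather than slurping it
# all into one scalar; memory stays flat however much the child prints.
open(my $child, '-|', 'echo hello')    # stand-in for: perl $_
    or die "pipe open failed: $!";

while (my $line = <$child>) {
    print $line;                       # handle each line as it arrives
}
close $child or warn "child exited with status $?";
```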

    After Compline,

      open is not missing its third argument; the two-argument form is perfectly valid syntax. It is usually used when you want to have more control over how the command is executed, e.g. by using exec. From the perldoc:

      If you open a pipe on the command '-', i.e., either '|-' or '-|' with the 2-argument (or 1-argument) form of open(), then there is an implicit fork done, and the return value of open is the pid of the child within the parent process, and 0 within the child process.

      ... and taking the (slightly modified) example from perlipc:

      $pid = open(KID_TO_READ, "-|");
      defined($pid) or die "fork: $!";

      if ($pid) {   # parent
          while (<KID_TO_READ>) {
              # do something interesting
          }
          close(KID_TO_READ) or warn "kid exited $?";
      }
      else {        # child
          exec($program, @args)
              or die "can't exec program: $!";
          # NOTREACHED
      }
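      For the original problem of several commands at once, the same pattern extends to one '-|' pipe per command. A sketch, where the echo commands stand in for the contents of @gets:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One pipe per command: each open() forks a child that runs the
# command; all children execute concurrently while the parent
# collects their output afterwards, keyed by command.
my @cmds = ('echo one', 'echo two', 'echo three');  # stand-ins for @gets

my @kids;
for my $cmd (@cmds) {
    open(my $fh, '-|', $cmd)
        or die "can't fork for '$cmd': $!";
    push @kids, { cmd => $cmd, fh => $fh };
}

for my $kid (@kids) {
    local $/;                              # slurp this child's output
    my $out = readline($kid->{fh});
    close $kid->{fh} or warn "'$kid->{cmd}' exited with status $?";
    print "[$kid->{cmd}] $out";
}
```

      If a child can produce more than a pipe buffer's worth of output before the parent gets around to reading it, it will stall; reading the handles round-robin with IO::Select avoids that.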


Re: Code and Process Efficiency
by sgifford (Prior) on Dec 29, 2003 at 21:26 UTC

    As others have said, use top to see if you're running out of memory. That seems the most likely culprit.

    If the amount of output from your programs is large, an improvement that would reduce memory usage would be to change the child code to:

    unless ($cpid[$i]) {
        exec 'perl', $_
            or die "exec error: $!\n";
    }
    (You might have to play around with file descriptors a little to make this work just right.) Using backticks causes another fork to be done, with the resulting parent reading the child's output, storing it in a scalar variable, and then printing the result when the child exits. Using exec instead saves a fork, and rather than reading all of the output into a scalar variable, just lets the child write to STDOUT directly.

    Also, adding better error checking, along with use strict and the -w flag might help you locate errors that are making your program behave in unexpected ways.

    How many programs are listed in @gets?

Re: Code and Process Efficiency
by dominix (Deacon) on Dec 30, 2003 at 01:23 UTC
    Why re-invent the wheel? I propose a merlyn{my hero}-like reply :-) : why don't you check this column and realise that you may have to wait for some processes to finish before launching others?
      But I do need to run the processes at the same time. The calls are to programs that pull certain data as selected by the user; think of it like asking a user which reports to pull, where each sub-program generates its own report. Yes, it could be done sequentially, but that takes too long for the needs of this application. I'm thinking that I should be able to speed this up to run 6 reports in under 6-7 seconds. At least, that is the immediate goal.

      Also, I am investigating possible code-efficiency issues within the individual programs, but that will be the next step.

Re: Code and Process Efficiency
by bl0rf (Pilgrim) on Dec 30, 2003 at 03:07 UTC
    I think the performance hit is due to your scripts all trying to use the CPU at the same time. Try calling the scripts sequentially, because you don't need all of them running at once (do you??).

    @scripts = ('', '', '');
    foreach $script (@scripts) {
        system("perl -W " . $script);
    }
    Using system waits for each program to finish running before it executes the next one.

      Actually, the scripts do need to run at the same time, hence the code as it was written. I wouldn't be trying to fork off multiple processes if I didn't need them to run at the same time.

Approved by duct_tape