PerlMonks

Redirecting stdout/stderr to pipe

by 0xbeef (Hermit)
on Sep 19, 2005 at 11:25 UTC ( #493124=perlquestion )

0xbeef has asked for the wisdom of the Perl Monks concerning the following question:

I have a program that collects output from o/s commands by forking a child process and reading its stdout/stderr separately via a pipe. The duration and output of each exec() may not exceed a certain threshold; otherwise the exec'd child gets killed and the next system command is run.

I have code that does this _slowly_ - maybe because I'm not buffering the reads from the pipe, but handling each line of output individually while measuring size/duration. I cannot think of any other way of enforcing time/size thresholds on the child process... I'm hoping an enlightened monk has a clever trick for something like this?

Conditions:
1. The child process' stdout and stderr must be processed in separate handles.
2. The exec'd child cannot run for longer than xx secs.
3. The exec'd child's stdout and stderr must not exceed x lines/size.

0xbeef

Replies are listed 'Best First'.
Re: Redirecting stdout/stderr to pipe
by betterworld (Curate) on Sep 19, 2005 at 11:49 UTC
    As for the runtime restriction, you may use bash's ulimit command if you are running Unix. Note, however, that ulimit can only restrict CPU time, not idle (wall-clock) time.
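    For instance, a small wrapper along these lines (the helper name and the 5-second limit are hypothetical, and it assumes bash is available on the host) imposes a CPU-time cap before exec'ing the real command:

```perl
# Hypothetical wrapper: use bash's ulimit -t to cap CPU seconds,
# then exec the real command so the limit applies to it directly.
# Note: this limits CPU time only; a sleeping child is unaffected.
use strict;
use warnings;

sub run_with_cpu_limit {
    my ($cpu_secs, $cmd) = @_;
    system('bash', '-c', "ulimit -t $cpu_secs; exec $cmd");
    return $? >> 8;    # shell-style exit status of the command
}

my $rc = run_with_cpu_limit(5, 'echo ok');
print "rc=$rc\n";
```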
      The code acts as an information collector for a system administrator. It is initiated remotely via Net::SSH::Perl, and must therefore automatically enforce the maximum allowed number of seconds a command may run, and the amount of stdout or stderr it may produce.

      For example, it may not assume that "who failedlogin" will just work; it must prevent it from running for longer than, say, 60 secs or producing more than 5000 lines of stdout/stderr.

      -0xbeef

Re: Redirecting stdout/stderr to pipe
by Roy Johnson (Monsignor) on Sep 19, 2005 at 16:07 UTC
    alarm is the usual way to set a timeout. Set it when you fork, and reset it if the child exits in time.

    Use select to poll for available data on each of your filehandles, adding to a variable to keep track of the amount of output they've produced.
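    A minimal, self-contained sketch of that approach - alarm() for the time limit, 4-arg select() to poll the pipe, a running byte counter for the size limit. The stand-in command, limits, and variable names here are hypothetical, not from the original post:

```perl
# Sketch: time limit via alarm(), size limit via a byte counter,
# polling the child's stdout pipe with 4-argument select().
use strict;
use warnings;

my $timeout  = 2;       # hypothetical limits
my $maxbytes = 4096;

pipe(my $rd, my $wr) or die "pipe: $!";
my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {                       # child: stdout -> pipe, then exec
    close $rd;
    open STDOUT, '>&', $wr or die "dup: $!";
    exec 'perl', '-e', 'print "x" x 100 for 1..5';   # stand-in command
    exit 1;
}
close $wr;

my ($bytes, $out, $timed_out) = (0, '', 0);
$SIG{ALRM} = sub { $timed_out = 1; kill 'TERM', $pid };
alarm $timeout;                        # arm the timer when we fork
while (1) {
    my $rin = '';
    vec($rin, fileno($rd), 1) = 1;
    my $n = select(my $rout = $rin, undef, undef, 0.5);   # poll
    next unless $n && $n > 0;
    my $got = sysread $rd, my $buf, 1024;
    last unless $got;                  # EOF (or read error)
    $out   .= $buf;
    $bytes += $got;                    # track output size
    if ($bytes > $maxbytes) { kill 'TERM', $pid; last }
}
alarm 0;                               # child finished in time: reset timer
waitpid $pid, 0;
print "read $bytes bytes, timed_out=$timed_out\n";
```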


    Caution: Contents may have been coded under pressure.
Re: Redirecting stdout/stderr to pipe
by Thelonius (Priest) on Sep 19, 2005 at 19:36 UTC
    I think your main problem is that select does not interact well with buffered I/O, which you are using via <READERR> and <READOUT>. You should really use sysread, and do a single split /\n/ at the very end.

    Here's how I might do it, using IPC::Open3 and IO::Select. You should add the waitpid/close from your example:

    #!perl -w
    use IPC::Open3;
    use IO::Select;
    use Symbol qw(gensym);
    use strict;

    my $cmd        = shift;
    my $g_timeout  = shift || 5;
    my $g_maxlines = shift || 10;
    my @output;
    my @err;
    $| = 1;

    runcmd(\$cmd, \@output, \@err);
    print "output = \n";
    print "  $_\n" for @output;
    print "error = \n";
    print "  $_\n" for @err;

    sub runcmd {
        my ($cmdref, $outref, $errref) = @_;
        my ($childin, $childout, $childerr);
        $childerr = gensym;
        my $pid = open3($childin, $childout, $childerr, $$cmdref)
            or die "Cannot run cmd '$$cmdref': $!\n";
        my $select = IO::Select->new
            or die "Cannot create select object: $!\n";
        my @hold_output;
        for (($childout, $childerr)) {
            $select->add($_);
            $hold_output[fileno($_)] = [0, 0, ""];    # eof, lines, buffer
        }
        $@ = undef;
        eval {
            local $SIG{ALRM} = sub { die "alarm\n" };
            my $deadline = $g_timeout + time;
            my $g_stop   = 0;
            alarm($g_timeout + 1);
            while (!$g_stop && $select->count > 0) {
                $! = 0;
                my @ready = $select->can_read($deadline - time);
                if (!@ready) {
                    $g_stop = 1;
                }
                for my $handle (@ready) {
                    my $fno = fileno($handle);
                    my $line;
                    my $bytesread = sysread $handle, $line, 1024;
                    if ($bytesread) {
                        $hold_output[$fno][2] .= $line;
                        $hold_output[$fno][1] += $line =~ y/\n/\n/;
                        if ($hold_output[$fno][1] >= $g_maxlines) {
                            $select->remove($handle);
                        }
                    }
                    elsif ($!) {
                        die "$!";
                    }
                    else {
                        $hold_output[$fno][0] = 1;    # EOF
                        $select->remove($handle);
                    }
                }
            }
            alarm(0);
        };
        if ($@) {
            # print STDERR "\$\@ = $@\n";
            die unless $@ eq "alarm\n";
        }
        # Note: lines may exceed $g_maxlines because of buffering
        # of output in the child process
        @$outref = split /\n/, $hold_output[fileno($childout)][2];
        @$errref = split /\n/, $hold_output[fileno($childerr)][2];
    }
      Okay, thanks for that! I would, however, like to be enlightened a bit more ;)

      1. I'd rather not use a solution using external modules, since the collector will have to run on MANY hosts as painlessly and non-intrusively as possible. Is there a high performance approach without external modules?
      2. I am unsure as to how much data should be read from the pipe at a time in order to optimise throughput as much as possible...

      -0xbeef

        1. I'd rather not use a solution using external modules, since the collector will have to run on MANY hosts as painlessly and non-intrusively as possible. Is there a high performance approach without external modules?
        The modules cited in the code that Thelonius posted (IPC::Open3, IO::Select, Symbol) are not "external", in the sense that they are all included in the standard Perl "core" distribution. That is, wherever a reasonably current version of Perl is installed, these modules are also installed by default.

        (If these many hosts you speak of have non-standard or hopelessly outdated perl installations, that's going to be a problem anyway.)
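        If in doubt on a particular host, Module::CoreList (itself shipped with core perl on reasonably modern installations) can report when each module first appeared in the core distribution. A quick check might look like:

```perl
# Hypothetical check: ask Module::CoreList when each module used in
# Thelonius' example first shipped with the core perl distribution.
use strict;
use warnings;
use Module::CoreList;

for my $mod (qw(IPC::Open3 IO::Select Symbol)) {
    my $first = Module::CoreList->first_release($mod);
    print defined $first
        ? "$mod has been core since perl $first\n"
        : "$mod is NOT a core module\n";
}
```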

        As for buffer size, the 1024 bytes suggested by Thelonius is fine for handling text that must ultimately be treated in a line-oriented fashion; if you're dealing with really high data rates, an 8k buffer should be about optimal.

Re: Redirecting stdout/stderr to pipe
by liverpole (Monsignor) on Sep 19, 2005 at 13:07 UTC
    Can you provide some of the code you've tried?  (If it's very large, try carving it down to the salient functionality).  The purpose of supplying this information is threefold -- first, it shows what you've tried or haven't tried; second, you may have a bug, or missed some subtle point which someone else can identify, and third, it gives those who might respond a starting point for thinking about a solution.
      Ok, here is the code snippet. In a perl nutshell, the parent forks a child which does the actual o/s command, and the parent continues to monitor the child's duration and output size.

      The child will redirect its output to the pipes and exec(). The parent _needs_ to acquire the child's stdout and stderr as fast as possible, and be able to handle (kill) a child running for longer than X time or larger than Y size.

      pipe READOUT, WRITEOUT;
      pipe READERR, WRITEERR;
      $g_pid = fork();
      if ( $g_pid < 0 ) {              # Error
          die "Error: fork failed with error $g_pid: $!.";
          $rc = 1;
      }
      elsif ( $g_pid == 0 ) {          # Child
          close(READOUT);
          close(READERR);
          close(STDOUT); open(STDOUT, ">&WRITEOUT");
          close(STDERR); open(STDERR, ">&WRITEERR");
          close(STDIN);  open(STDIN,  "</dev/null");
          exec($$cmdref);
          exit(1);
      }
      else {                           # Parent
          $g_did_run++;
          $g_pids{$g_pid} = 1;
          close(WRITEOUT);
          close(WRITEERR);
          if ( $g_timeout > 0 ) {
              $g_stop = 0;
              alarm $g_timeout;
          }
          my $r_in = "";
          my $w_in = "";
          vec($r_in, fileno(READOUT), 1) = 1;
          vec($r_in, fileno(READERR), 1) = 1;
          vec($w_in, fileno(STDOUT),  1) = 1;
          my $e_in = $r_in | $w_in;
          my ($r_out, $w_out, $e_out);
          my $cnt   = 0;
          my $eof_e = 0;
          my $eof_o = 0;
          while (1) {
              my ($nfound, $timeleft) =
                  select($r_out = $r_in, undef, $e_out = $e_in, 1);
              prog_log(20, sprintf(
                  "run_cmd: select returned time [%d] found [$nfound] " .
                  "timeleft [$timeleft] r_out [%s] r_err [%s] e_out [%s] e_err [%s]\n",
                  time,
                  vec($r_out, fileno(READOUT), 1) ? 'y' : 'n',
                  vec($r_out, fileno(READERR), 1) ? 'y' : 'n',
                  vec($e_out, fileno(READOUT), 1) ? 'y' : 'n',
                  vec($e_out, fileno(READERR), 1) ? 'y' : 'n'));
              if ( vec($r_out, fileno(READERR), 1) and !vec($e_out, fileno(READERR), 1) ) {
                  if ( eof(READERR) ) {
                      $eof_e = 1;
                  }
                  else {
                      my $line = <READERR>;
                      chomp $line;
                      push @$errref, $line;
                  }
              }
              if ( vec($r_out, fileno(READOUT), 1) and !vec($e_out, fileno(READOUT), 1) ) {
                  if ( eof(READOUT) ) {
                      $eof_o = 1;
                  }
                  else {
                      my $line = <READOUT>;
                      chomp $line;
                      push @$outref, $line;
                  }
              }
              last if ( $eof_e and $eof_o );
              last if ( $g_stop );
          }
          if ( $g_stop ) {             # Indicator for select loop to break
              $rc = 2;
          }
          else {
              my $loglvl = 5;
              my $t_pid = waitpid($g_pid, 0);
              if ( $t_pid < 0 ) {
                  $rc = 3;
              }
              elsif ( $t_pid == 0 ) {  # Process not terminated yet!
                  $loglvl = 1;
                  $rc = 4;
              }
              elsif ( $t_pid == $g_pid ) {
                  $rc = $?;
              }
              else {                   # Not possible!
                  $loglvl = 1;
                  $rc = 5;
              }
              close(READOUT);
              close(READERR);
              delete $g_pids{$g_pid};
              prog_log($loglvl, "run_cmd: waitpid($g_pid) returned $t_pid - rc $rc.");
          }

      Thanks in advance!
      -0xbeef

        Minor nit-pick that probably doesn't answer your immediate problem: "fork()" does not return a negative number; failure is indicated by returning undef. So your initial "if" condition should be:
        if ( ! defined( $g_pid ))
        Apart from that, you seem to be using variables with global scope for flow-control, and you didn't show how/whether these were initialized ($g_stop, $g_timeout). So it's hard to guess whether it's a problem that $g_stop is never assigned a value within the "while(1)" loop. Also, you didn't include whatever code you have (if any) that actually handles the event generated by "alarm()". Could that be relevant?
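        The corrected check can be sketched as follows - a standalone fork/reap skeleton using the defined() test, not the poster's full program:

```perl
# fork() returns undef on failure (with $! set), the child's pid in
# the parent, and 0 in the child - never a negative number.
use strict;
use warnings;

my $pid = fork();
if ( !defined($pid) ) {
    die "fork failed: $!";
}
elsif ( $pid == 0 ) {
    exit 0;                  # child: would normally exec here
}
else {
    waitpid $pid, 0;         # parent: reap the child
    print "child $pid reaped, status=$?\n";
}
```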

Node Type: perlquestion [id://493124]
Approved by g0n