PerlMonks  

Streaming to Handles

by crabbdean (Pilgrim)
on May 01, 2004 at 07:40 UTC ( [id://349582]=perlquestion )

crabbdean has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing code that attempts to output data to a filehandle, either STDOUT or whatever is specified. I want it to stream the output so that in your calling script you can work with each line of output as it comes out. The code below gives the guts of everything (a watered-down version, so you get an idea of my problem).

My problem is that the module that does all the work prints the output to the specified HANDLE, but you can't work with that output. E.g., if you output to LOGFILE in my calling script, the normal structure of ...
while (<LOGFILE>) { ## do something with $_ }
OR
foreach ($f->list) { ## do something with $_ }
... doesn't work. The reason is that my module outputs straight to the output filehandle instead of returning the data to my main calling script.

In the example given you can see the section I've marked as "##problem??" doesn't output each entry with a newline character. The output is being directly spat out from the module.

How do I capture the output as an input stream to my calling program? What am I doing wrong? Is this making sense? (see code in the readmore section)

My calling program is this:
#!perl
use strict;
use warnings;
push (@INC,'F:/dean/scr/pl/pms/dirlist');
require List;

my $f = List->new;
$f->stream_to(\*STDOUT);
$f->look_in('c:/');
foreach ($f->list) { ##problem??
    print "$_\n";
}
The module is this:
package List;

use strict;
use warnings;
use vars qw(@ISA @EXPORT_OK $VERSION);
use File::Spec qw(catfile);
use Carp;
use Cwd qw(cwd abs_path);

require Exporter;
@ISA = qw(Exporter);
@EXPORT_OK = qw(list);
$VERSION = '0.99';

######################################################################
# object interface

sub new {
    my ($class, $stream, $path, ) = @_;
    $stream = \*STDOUT unless @_ > 1;
    $path = cwd unless @_ > 2;
    my $self = {
        stream => $stream,
        path => File::Spec->canonpath($path),
    };
    bless $self, $class;
    return $self;
}

sub list {
    my ($self, $dir, $level) = @_;
    $dir = $self->{path} unless @_ > 1;
    $level = 0 unless @_ > 2;
    if ( opendir( DIR, $dir)) {
        $level++;
        map {
            my $fullfile = File::Spec->catfile( $dir, $_ );
            if (-d $fullfile) {
                $self->list ( $fullfile, $level );
            }
            my $str = $self->{stream};# ref to the stream
            print $str $fullfile; ## problem??
        } File::Spec->no_upwards( readdir( DIR ) );
    } else {
        warn "$dir : opendir failed: $!\n";
    }
    close DIR;
}

sub stream_to{
    my ($self, $strm) = @_;
    $strm = \*STDOUT unless @_ > 1;
    $self->{stream} = $strm;
}

sub look_in {
    my ($self, $path) = @_;
    $path = cwd unless @_ > 1;
    $self->{path} = $path;
}

Dean
The Funkster of Mirth
Programming these days takes more than a lone avenger with a compiler. - sam
RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers

Replies are listed 'Best First'.
Re: Streaming to Handles
by matija (Priest) on May 01, 2004 at 09:35 UTC
    You want to write into a file in the module and then read it in the calling program and write a bit more and read a bit more? That is NOT going to work. Not with STDOUT, not with pipes, not even with sockets.

    It is not the proper way to communicate with the subroutine. Why don't you have the subroutine simply return a bunch of lines together in a string (or an array), and then process them in the calling program as you see fit? That will work much better.

      I've already created it and got it working passing back an array, which you then can loop through as normal. Problem is in my case the array is going to be returning a MASSIVE amount of data, we are talking of a processing time of about 6 hours. Unless I'm on a supercomputer with a crap load of RAM I'd have to chop that array into pieces and pass it back piecemeal or stream it. Plus streaming is faster I believe. There has to be a way because streaming is possible, I'm just not familiar with the syntax. I've seen modules where people write to a stream and then read the FILEHANDLE on the stream. Help! Please?

      Dean
      The Funkster of Mirth
      Programming these days takes more than a lone avenger with a compiler. - sam
      RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
        The only way I can think of doing this without heavily restructuring your code is to create a pipe, fork off a child, then have the child write the list of filenames to the pipe and have the parent read them back in.
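The pipe-and-fork idea described above could look roughly like the following. This is only a minimal sketch: `emit_entries` is a hypothetical stand-in for the module's `list` method, and it assumes a platform where `fork` behaves as it does on Unix (on Win32 perls fork is emulated and may behave differently).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the module's list(): prints one entry per
# line to whatever handle it is given.
sub emit_entries {
    my ($out) = @_;
    print {$out} "entry $_\n" for 1 .. 5;
}

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
defined $pid or die "fork failed: $!";

if ($pid == 0) {
    # Child: only writes, so close the read end first.
    close $reader;
    emit_entries($writer);
    close $writer;
    exit 0;
}

# Parent: only reads. Closing its copy of the write end lets <$reader>
# see end-of-file once the child has finished.
close $writer;
my $count = 0;
while (my $line = <$reader>) {
    chomp $line;
    $count++;
    print "got: $line\n";    # handle each entry as it is read
}
close $reader;
waitpid $pid, 0;             # reap the child
```

The parent never holds more than one line at a time, which is the point of the exercise when the full list would not fit in memory.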
Re: Streaming to Handles
by zentara (Archbishop) on May 01, 2004 at 14:25 UTC
    Just some brainstorming....

    Would it be possible to use the "variable as a filehandle trick"? You could access the variable's value at any time.

    #!/usr/bin/perl
    my $foo = '';
    open FILEHANDLE, '+>', \$foo or die $!;
    my $count = 0;
    while(1){
        print FILEHANDLE $count;
        print "$foo\n";
        select(undef,undef,undef,.1);
        $count++;
    }
    close FILEHANDLE or die $!;

    I'm not really a human, but I play one on earth. flash japh
      I still haven't found a great solution to this problem. Forking seems to me an overly complicated implementation, and it doesn't create a simple interface for users.

      The solution suggested above seems the closest to what I can imagine. But two things: In this example your print to the FILEHANDLE and the bound variable are in the same program. I've attempted to separate them into a script and a separate module method but was unsuccessful. Are you able to demonstrate the above in this way? That would be really helpful.

      Secondly, I had another thought along similar lines: I could create an object variable in my object like this:
      # object interface
      sub new {
          my ($class, $stream, $path, ) = @_;
          my $file = undef;
          $path = cwd unless @_ > 2;
          my $self = {
              file => \$file,
              path => File::Spec->canonpath($path),
          };
          bless $self, $class;
          return $self;
      }
      and then in my script make reference to this variable like this:
      my $f = List->new;
      $f->look_in('c:/');
      foreach ($f->list) { ##problem??
          print "${$f->{file}}\n";
      }
      Now I know straight up, just looking at it, that this code won't work in the foreach statement, but is anyone getting the gist of what I'm attempting? In my program you can instantly dereference this $f->{file} at any time to access the file name it is currently working on. But how does the script know when this piece of memory changes value, so it can write it out again? Do you just call it in a "while" loop? Hmmmm ... the idea has merit but I'm not sure how to implement this.
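One way the "how does the script know when the value changes" question is sometimes answered is with a callback: the module calls a code reference supplied by the script each time it moves to a new file. This is a different technique from the object-variable idea above, offered only as a sketch. The package name `My::Walker` and the methods `each_file` and `current` are hypothetical, and the fixed list of names stands in for the real directory walk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Walker;    # hypothetical name

sub new {
    my ($class) = @_;
    return bless { current => undef }, $class;
}

# Stand-in for the directory walk: updates the "current file" slot and
# then calls the caller's code ref, so the caller runs once per entry.
sub each_file {
    my ($self, $callback) = @_;
    for my $name (qw(one.txt two.txt three.txt)) {
        $self->{current} = $name;
        $callback->($self);
    }
}

# Accessor for the entry currently being worked on.
sub current { $_[0]{current} }

package main;

my $walker = My::Walker->new;
my @seen;
$walker->each_file(sub {
    my ($w) = @_;
    push @seen, $w->current;     # kept here only to show what was visited
    print $w->current, "\n";
});
```

The script never polls: control passes to its code ref exactly when the current file changes.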

      Surely there must be a way to stream ... how is it done? Somebody .. anybody ... are you out there hearing my cries for help? How do you do this? There is a way, I just need to know how!

      Dean
      The Funkster of Mirth
      Programming these days takes more than a lone avenger with a compiler. - sam
      RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
        two things: In this example your print to the FILEHANDLE and the bound variable are in the same program. I've attempted to separate them into a script and a separate module method but was unsuccessful. Are you able to demonstrate the above in this way? That would be really helpful.

        I'm not the best person to ask for help on modules. I basically understand the concept, but I still think like an old functional programmer. My first thought on putting the filehandle in the module is to use "our" instead of "my" on the variable, or export the variable from the module.

        This would make a nice separate question to ask, "how to use a module variable as a filehandle, and export it to the main program".
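A minimal sketch of splitting the in-memory filehandle between a script and a module, under some assumptions: the package `My::Lister` is hypothetical and inlined so the example is a single file, its `list` method writes fixed names instead of walking a directory, and the buffer is owned by the script rather than exported from the module. In-memory filehandles need perl 5.8 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module, inlined here so the sketch is a single file.
package My::Lister;

sub new {
    my ($class, $stream) = @_;
    return bless { stream => $stream }, $class;
}

# Stand-in for the real directory walk: writes one name per line.
sub list {
    my ($self) = @_;
    my $out = $self->{stream};
    print {$out} "$_\n" for qw(a.txt b.txt c.txt);
}

package main;

# The caller owns the buffer; the module only ever sees a filehandle.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";

my $lister = My::Lister->new($fh);
$lister->list;
close $fh;

# Everything the module printed is now in $buffer.
my @names = split /\n/, $buffer;
print "saw $_\n" for @names;
```

Note that this collects the whole output in `$buffer` before the script looks at it, so it does not reduce memory use for a very large listing.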


        I'm not really a human, but I play one on earth. flash japh
Re: Streaming to Handles
by Stevie-O (Friar) on May 01, 2004 at 18:52 UTC
    If what you're doing is fork-friendly (i.e. doesn't involve modifying global variables), Perl has a built-in syntax just for you -- with open. (This sort of thing, by the way, is why I occasionally reread the perl PODs.)
    #foo() is the sub that outputs a lot of data
    my $pid = open(SUBOUT, '-|');
    defined($pid) or die "Fork failed: $!";
    if ($pid == 0) {
        $|=1;
        foo();
        exit(0);
    }
    while (<SUBOUT>) ...
    The open will fork your script, and the forked copy (the one that calls the function) will have its STDOUT piped to the filehandle given to open() (in this case, SUBOUT).
    --Stevie-O
    $"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc
Re: Streaming to Handles (iterator)
by tye (Sage) on May 05, 2004 at 23:26 UTC

    You don't need a stream; you want an iterator (yes, similar term). To turn this into an iterator in Perl5, you need to keep your own "stack". That is easy to do with an anonymous array (or two) inside your object.

    I put file names that I have yet to output into @{ $self->{files} } and output the next one from there the next time the iterator is called. I put directory names that I have yet to read the list of files from into @{ $self->{dirs} } and when there aren't any more file names to return, I read the next directory.

    First, here is how you'd use my iterator:

    #!/usr/bin/perl
    use strict;
    use warnings;
    require List;

    my $f= List->new( @ARGV );
    my $file;
    while( $file= $f->next() ) {
        print "$file\n";
    }

    And here is the code that implements it:

    package List; # Terrible name
    use strict;
    use warnings;
    use Cwd qw( cwd );
    require File::Spec;
    use vars qw( $VERSION );
    $VERSION = '0.99';

    sub new {
        my( $class, $path )= @_;
        # Bless first: look_in() is a method call and would die on an
        # unblessed hash reference.
        my $self= bless { }, $class;
        # With no path given, look_in() falls back to cwd(), so the
        # dirs/files lists always exist before next() is called.
        $self->look_in( defined $path ? $path : () );
        return $self;
    }

    sub look_in {
        my( $self, $path )= @_;
        $path= cwd() unless @_ > 1;
        $path= File::Spec->canonpath($path);
        $self->{path}= $path;
        $self->{dirs}= [$path];
        $self->{files}= [];
    }

    sub next {
        my( $self )= @_;
        while( 1 ) {
            # Hand out queued file names first.
            if( @{ $self->{files} } ) {
                my $file = shift @{ $self->{files} };
                if( -d $file ) {
                    push @{ $self->{dirs} }, $file;
                }
                return $file;
            }
            # Nothing queued and no directories left: we are done.
            if( ! @{ $self->{dirs} } ) {
                return;
            }
            # Otherwise read the next directory and queue its entries.
            my $dir= shift @{ $self->{dirs} };
            if( opendir( DIR, $dir ) ) {
                $self->{files}= [
                    map {
                        File::Spec->catfile( $dir, $_ );
                    } File::Spec->no_upwards( readdir(DIR) )
                ];
                closedir DIR;
            } else {
                warn "opendir failed, $dir: $!\n";
            }
        }
    }

    1;

    I tested it enough to see that it appears to work just fine.

    If you had directories with huge numbers of files directly in them (not in subdirectories), then you might want to make the iterator a bit more complicated such that you don't keep a list of file names and instead return each file name (almost) immediately after you get it back from readdir (but I'm not sure I would recommend that).
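The more complicated variant mentioned above, where each name is returned almost immediately after readdir produces it, might be sketched as follows. This is not tye's code: the package name `My::LazyList` and the method `next_path` are hypothetical, and the throwaway tree built with File::Temp is there only so the sketch can run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::LazyList;    # hypothetical name

use File::Spec;

sub new {
    my ($class, $path) = @_;
    return bless { pending => [$path], dir => undef, dh => undef }, $class;
}

# Returns the next path, or undef when the tree is exhausted.
sub next_path {
    my ($self) = @_;
    while (1) {
        # Open another directory only when none is being read.
        if (!defined $self->{dh}) {
            my $dir = shift @{ $self->{pending} };
            return unless defined $dir;       # nothing left to visit
            if (opendir(my $dh, $dir)) {
                @{$self}{qw(dir dh)} = ($dir, $dh);
            } else {
                warn "opendir failed, $dir: $!\n";
            }
            next;
        }
        # readdir in scalar context returns a single entry.
        my $entry = readdir($self->{dh});
        if (!defined $entry) {
            closedir($self->{dh});
            $self->{dh} = undef;
            next;
        }
        next if $entry eq File::Spec->curdir or $entry eq File::Spec->updir;
        my $path = File::Spec->catfile($self->{dir}, $entry);
        push @{ $self->{pending} }, $path if -d $path;
        return $path;
    }
}

package main;

use File::Temp qw(tempdir);

# Build a tiny throwaway tree: a.txt, sub/, sub/b.txt.
my $root = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($root, 'sub') or die "mkdir failed: $!";
for my $name ('a.txt', File::Spec->catfile('sub', 'b.txt')) {
    open(my $fh, '>', File::Spec->catfile($root, $name)) or die "open failed: $!";
    close $fh;
}

my $it = My::LazyList->new($root);
my @found;
while (defined(my $path = $it->next_path)) {
    push @found, $path;
    print "$path\n";
}
```

Only the queue of unread directories is held in memory, at the price of keeping one directory handle open between calls.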

    - tye        

      Thanks, I haven't tested this but looking at it, it "makes sense" and appears to be what I'm looking for. Thanks. BIG GRINS!! ++ I'll test it in the coming days and post back with my findings/results.

      By the way, the package name "List" was only used for this example although I'm unsure of what to call the module. I'm assuming it will come under the "File::" modules and could call it "list" or "DirList" or something. Do you have any good suggestions for a name?

      Thanks once again. :-)

      Dean
      The Funkster of Mirth
      Programming these days takes more than a lone avenger with a compiler. - sam
      RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
      I've written this into my code and it works perfectly! :-) I'll tweak it a bit and get it working with the other features in my module. But that's exactly what I was after. Big ++ !!!

      I'm just running a benchmark now to see a comparison against the alternative solution of returning files as arrays.

      I intend to leave both methods in the module so it gives the user the choice of streaming or returning via arrays.

      Here are the benchmark results:
               Rate stream  array
      stream 76.1/s     --   -14%
      array  88.7/s    17%     --

               Rate stream  array
      stream 79.8/s     --    -5%
      array  83.9/s     5%     --

               Rate stream  array
      stream 72.2/s     --   -10%
      array  80.2/s    11%     --

      As you can see, returning via arrays is faster.

      Once again a big thank you!

      Dean
      The Funkster of Mirth
      Programming these days takes more than a lone avenger with a compiler. - sam
      RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
Re: Streaming to Handles
by bart (Canon) on May 06, 2004 at 03:29 UTC
    It's unclear to me what you're actually after, but I have a feeling you're trying to implement coroutines. It is, basically, a form of threading, where two threads run independently, and the output of one is the input to the other, and the former one is slowed down when the latter one can't keep up. I'm sorry, Perl5 doesn't do coroutines. Perhaps Perl6 might; I recall the term coming up on the Perl6 mailing lists, when I still followed the discussion there, years ago.

    Like some of the other people have suggested, if you don't share variables between the two threads apart from this stream, I'd separate them into two programs, and pipe the output of the former into the input of the latter. The following works on Linux, but not on Windows98. I think it works pretty much like you envisioned:

    #!/usr/bin/perl -w
    if(my $pid = open STDOUT, "|-") {
        local $\ = "\n";
        print for 1 .. 100;
        close STDOUT;
    } elsif (!defined $pid) {
        die "Cannot fork: $!";
    } else {
        while(<STDIN>) {
            chomp;
            print "I got: '$_'\n";
        }
    }
    This prints 100 lines like these:
    I got: '1'
    I got: '2'
    ...
    I got: '98'
    I got: '99'
    I got: '100'
    
    Don't forget to close STDOUT in the, eh, "parent" — the first branch, or it'll hang.

    For some more info on this and related modes for open, see the docs on open, perlopentut and perlipc.

    If this has to run on Windows, and the above doesn't work (it just might on NT/XP), you can use open2() from IPC::Open2 (or open3() from IPC::Open3) and have the script launch itself. For an example script that works this way, see Win32::GUI Chatterbox client, in particular the sub initServer(), where the program launches a copy of itself with open2() using a command list (as for system) of ($^X, $0, $flags): $^X is the name of the perl executable, $0 the name of the script, and $flags a special command-line switch ("-s") that makes the launched script behave differently. See the lines that test if($opt_server) {.
    HTH.
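A minimal sketch of the open2() approach, with one simplification stated up front: instead of relaunching itself with ($^X, $0, $flags), the script here starts a second perl with a hypothetical one-line producer passed via `-e`, so the example stays in a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;

# Hypothetical producer, run in a separate perl process. A real script
# would pass ($^X, $0, $flags) here to relaunch itself in producer mode.
my @cmd = ($^X, '-e', 'print "item $_\n" for 1 .. 3');

# open2 returns the child's pid; $from_child reads the child's STDOUT,
# $to_child writes to the child's STDIN.
my $pid = open2(my $from_child, my $to_child, @cmd);
close $to_child;    # nothing to send to the child

my @items;
while (my $line = <$from_child>) {
    chomp $line;
    push @items, $line;
    print "I got: '$line'\n";
}
close $from_child;
waitpid $pid, 0;    # reap the child
```

Because the producer is a separate process started from a command list, this does not rely on fork being available in the calling script.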
      Thanks for taking the time with that reply bart. Problem is (which I hate) the "|-" syntax isn't supported on Win32 systems. A real pain in the ar$3. I'm not sure how a similar structure/syntax is achieved in Win32. Anyone?

      Coroutines sound VERY interesting and I'd be interested in reading more into that, although at present it sounds like it's still in the experimental stage (is that true?) and I'd rather stick with a solution that is (eh hem..) "standard" perl.

      Piping is possible, although implementing it in a module isn't so easy to achieve. In fact, based on the responses in this discussion, it may be too difficult or impossible. Your solution shows me an example of piping, which I can do, but not how to apply it to my problem. In this case the context of the problem takes piping out of its normal context, hence my post to the group. In general the whole piping solution doesn't seem "intuitively simple" to me, if you get my meaning.

      Thanks also for all the links at the bottom. I'm interested to have a look through some of the code examples suggested. So thanks. ++

      Dean
      The Funkster of Mirth
      Programming these days takes more than a lone avenger with a compiler. - sam
      RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
OT Re: Streaming to Handles
by kappa (Chaplain) on May 01, 2004 at 17:06 UTC
    Not really an answer, sorry.

    BTW, such tasks are exactly what coroutines will be in Perl6 for :)

    You would just yield the next line from your sub into the caller for her to process it in a loop and then call the sub for the next line and so on.
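Perl 5 has no `yield`, but a closure can approximate the pattern described above: the sub hands back one line per call and remembers where it left off. A minimal sketch, where `make_line_source` is a hypothetical name and the fixed list stands in for real output:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns a closure that hands back one line per call and undef at the
# end. The closed-over $i plays the role of the saved position.
sub make_line_source {
    my @lines = @_;
    my $i = 0;
    return sub {
        return $i < @lines ? $lines[$i++] : undef;
    };
}

my $next_line = make_line_source('first', 'second', 'third');

my @processed;
while (defined(my $line = $next_line->())) {
    push @processed, uc $line;      # the caller's per-line work
    print "processing: $line\n";
}
```

Unlike a real coroutine, the producer's state has to be kept by hand in the closed-over variables.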

Node Type: perlquestion [id://349582]
Approved by DigitalKitty