reading N files

gri6507 has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I have a need to read N files one line at a time and then manipulate those individual lines depending on their content. I was hoping to do something like this

use strict;
use warnings;
use English;

print "Usage: $0 output input1 input2 ...\n";

my $outfile = shift @ARGV;
open(OUT, ">$outfile") || die "Can't open $outfile for writing: $!\n";

my @infile;
foreach (@ARGV) {
   open(IN, $_) || die "can't open $_ for reading: $!\n";
   push @infile, \*IN;
}

foreach(@infile){ 
   my $i = <$_>;
   print "Got: $i";
}

where the @infile array contains all the open file handles so I could read from each one individually. Unfortunately, my test loop at the bottom seems to only read the contents of the last input file specified on the command line. I have this nagging feeling that my problem has something to do with closures (a concept I do not completely comprehend yet). What am I doing wrong?


Re: reading N files by Fletch (Bishop) on Jul 19, 2006 at 14:33 UTC
`for (@ARGV) { open( my $in, "<", $_ ) or die "Can't open '$_': $!\n"; push @infile, $in; }` Presuming a recent enough Perl, of course. See perlopentut and the open docs. With older Perls you could use IO::File instead in a similar fashion.
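For the IO::File route mentioned above, a minimal sketch might look like the following. The sample files and the names `@names` and `@first_lines` are made up for illustration; only `IO::File->new` and `getline` come from the module itself.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

# Two throwaway sample files (hypothetical contents, for illustration only).
my @names;
for my $text ("a1\na2\n", "b1\nb2\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Can't close $name: $!\n";
    push @names, $name;
}

# Each IO::File object is an independent handle, so all of them stay open.
my @infile;
for my $name (@names) {
    my $in = IO::File->new($name, 'r')
        or die "Can't open '$name': $!\n";
    push @infile, $in;
}

# Read one line from each file in turn.
my @first_lines;
for my $in (@infile) {
    my $line = $in->getline;
    chomp $line;
    push @first_lines, $line;
}
print "Got: $_\n" for @first_lines;
```

Each handle keeps its own read position, so a second pass over `@infile` would return the second line of every file.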
Re^2: reading N files by gri6507 (Deacon) on Jul 19, 2006 at 14:44 UTC
Thank you. That was exactly it!	[reply]
Re: reading N files by polettix (Vicar) on Jul 19, 2006 at 14:42 UTC
The problem is that the `IN` you open is the same at each iteration, because it is the filehandle slot of the `IN` symbol (typeglob) in package `main`. In your case, it is the same as using a global variable: you always access the same variable, and you keep overwriting it. Every `\*IN` you push refers to that one glob, and each `open` implicitly closes the file opened before it, so all the entries in `@infile` end up reading from the last file. The solutions are in Fletch's post. Flavio perl -ple'$_=reverse' <<<ti.xittelop@oivalf Don't fool yourself.
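The difference can be seen in a small self-contained sketch. The sample files and the variable names are invented for the demonstration; the first loop mirrors the original code and the second mirrors the lexical-handle fix.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small sample file and return its name (hypothetical helper).
sub make_file {
    my ($text) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Can't close $name: $!\n";
    return $name;
}

my @names = (make_file("a1\na2\n"), make_file("b1\nb2\n"));

# Read one line from a handle and strip the newline.
sub first_line {
    my ($fh) = @_;
    my $line = readline $fh;
    chomp $line;
    return $line;
}

# Bareword handle: every reference points at the same glob, *main::IN,
# and each open implicitly closes the previous file.
my @globs;
for my $name (@names) {
    open(IN, '<', $name) or die "Can't open '$name': $!\n";
    push @globs, \*IN;
}
my @from_globs = map { first_line($_) } @globs;

# Lexical handles: each open creates a fresh, independent handle.
my @lexicals;
for my $name (@names) {
    open(my $in, '<', $name) or die "Can't open '$name': $!\n";
    push @lexicals, $in;
}
my @from_lexicals = map { first_line($_) } @lexicals;

print "globs:    @from_globs\n";      # both reads come from the last file
print "lexicals: @from_lexicals\n";   # one read from each file
```

With the shared glob, both reads consume consecutive lines of the second file; with lexical handles, each read comes from its own file.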
Re: reading N files by GrandFather (Saint) on Jul 19, 2006 at 17:36 UTC
This smacks of an XY Problem. Unless you are interleaving the files in some fashion, opening the file handles in advance does not seem like a good solution. If this is not a task requiring interleaving you might like to explain what you want to achieve so we can help with the larger problem. BTW, you should generally use the three parameter open to avoid surprises. Your two open lines change to: `open (OUT, '>', $outfile) || ...` and `open (IN, '<', $_) || ...` Generally the usage line is better printed only when "required": `if (! @ARGV) { print "Usage: $0 output input1 input2 ...\n"; exit -1; }` DWIM is Perl's answer to Gödel
Re^2: reading N files by gri6507 (Deacon) on Jul 19, 2006 at 21:16 UTC
Actually, this was to interleave the input files. I had a problem where I had N files, each with 2 columns: a time column and a value column. I needed to interleave the N files, so that the output would have N+1 columns: one sorted time column and N columns of either empty cells or the corresponding value. I hope this is a bit clearer.
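One way to sketch the interleaving described above is a merge over the open handles. This is only an illustration under stated assumptions: each input is already sorted by numeric time, columns are whitespace-separated, and the output is tab-separated. The sub name `merge_by_time` and the in-memory sample data are hypothetical.

```perl
use strict;
use warnings;

# Merge already-open handles whose lines look like "time value".
# Assumes each input is sorted by numeric time (an assumption, not from the post).
sub merge_by_time {
    my ($out, @in) = @_;

    # Return the next [time, value] pair from a handle, or undef at EOF.
    my $next = sub {
        my ($fh) = @_;
        while (defined(my $line = readline $fh)) {
            my ($time, $value) = split ' ', $line;
            return [$time, $value] if defined $value;   # skip blank or short lines
        }
        return;
    };

    my @current;
    $current[$_] = $next->($in[$_]) for 0 .. $#in;

    while (grep { defined } @current) {
        # Smallest pending time across all files.
        my ($min) = sort { $a <=> $b } map { $_->[0] } grep { defined } @current;

        my @row = ($min);
        for my $i (0 .. $#in) {
            if (defined $current[$i] && $current[$i][0] == $min) {
                push @row, $current[$i][1];
                $current[$i] = $next->($in[$i]);   # advance only the files just used
            }
            else {
                push @row, '';                     # empty cell for this file
            }
        }
        print {$out} join("\t", @row), "\n";
    }
    return;
}

# Demonstration with in-memory files instead of real ones.
my @data = ("1 a\n3 c\n", "2 x\n3 y\n");
my @in;
for my $text (@data) {
    open(my $fh, '<', \$text) or die "Can't open in-memory file: $!\n";
    push @in, $fh;
}

my $merged = '';
open(my $out, '>', \$merged) or die "Can't open in-memory output: $!\n";
merge_by_time($out, @in);
close $out or die "Can't close output: $!\n";
print $merged;
```

Only one pending line per file is held in memory, which is why the handles need to stay open at the same time.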
Re^3: reading N files by graff (Chancellor) on Jul 20, 2006 at 01:37 UTC
So, if you happened to be on a *nx box (or have a windows port of standard unix utilities), you could just do a shell command (that includes a perl one-liner). Assuming the N files are named in some systematic way, and columns are separated by whitespace: `paste file.* | perl -pe '($t)=(/^(\S+)/); s/\t$t//g;' > multi-column.file` The unix "paste" command takes a list of file names and concatenates them "horizontally", line by line; for a list of input files (1..N), its default behavior replaces the newline with a tab for each line of files 1..N-1. Assuming that all files in the set have the same series of values in the first column, the perl script removes all but the first occurrence of that value on each line. (If all these assumptions don't apply, then your approach of reading from a set of file handles in a loop is fine, of course.)
Re^4: reading N files by gri6507 (Deacon) on Jul 20, 2006 at 12:24 UTC
Re: reading N files by Solo (Deacon) on Jul 19, 2006 at 15:13 UTC
"I have a need to read N files one line at a time and then manipulate those individual lines" The diamond operator also works for this, and is much simpler, IMO. `use strict; use warnings; use English; print "Usage: $0 output input1 input2 ...\n"; my $outfile = shift @ARGV; open(OUT, ">$outfile") || die "Can't open $outfile for writing: $!\n"; while(<>){ # $_ contains line }` Simplicity is in the eye of the beholder, of course. YMMV. Update: Later posts make it clear the OP wanted to process each file's line N before moving to each file's line N+1. Obviously, the diamond operator is not very useful for that behavior. --Solo -- You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.
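To illustrate why the diamond operator reads files one after another rather than interleaved, here is a small sketch. The sample files and the `@seen` array are invented for the demonstration; `close ARGV if eof` is the documented idiom for resetting `$.` per file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two throwaway sample files (hypothetical contents, for illustration only).
my @names;
for my $text ("a1\na2\n", "b1\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Can't close $name: $!\n";
    push @names, $name;
}

my @seen;
{
    local @ARGV = @names;          # <> reads the files named in @ARGV, in order
    while (my $line = <>) {
        chomp $line;
        push @seen, "$.:$line";    # $. is the current line number
        close ARGV if eof;         # reset $. at the end of each file
    }
}
print "$_\n" for @seen;
```

All lines of the first file arrive before any line of the second, so this suits sequential processing but not a line-by-line interleave.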

