Re: parallel reading
by Zaxo (Archbishop) on May 09, 2006 at 13:38 UTC
open my $out, '>', 'ABC' or die $!;
{
    local $_;
    open my $A, '<', 'A' or die $!;
    open my $B, '<', 'B' or die $!;
    open my $C, '<', 'C' or die $!;
    no warnings 'uninitialized';
    while ($_ = <$A> . <$B> . <$C>) {
        s/\n//g;
        print $out $_, "\n";
    }
}
close $out or warn $!;
That lets the files have different numbers of lines. Memory use is small, and independent of file size.
Update: Repaired the thinko blazar++ spotted. Empty lines are not a problem: we don't chomp, so they retain their newlines until we s/// them away. I like blazar's extension to an arbitrary number of files.
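The same keep-reading-until-every-stream-is-exhausted idea can be sketched in POSIX shell. File names and contents below are invented for the demo; the point is only that files with different line counts merge cleanly:

```shell
# Sample inputs with different line counts (invented for the demo)
printf 'x1\nx2\nx3\n' > A
printf 'y1\n' > B
printf 'z1\nz2\n' > C

# Read all three streams in lockstep; stop only when every one is at EOF
exec 3<A 4<B 5<C
: > merged
while :; do
    a=''; b=''; c=''; any=0
    IFS= read -r a <&3 && any=1
    IFS= read -r b <&4 && any=1
    IFS= read -r c <&5 && any=1
    [ "$any" -eq 1 ] || break
    printf '%s%s%s\n' "$a" "$b" "$c" >> merged
done
exec 3<&- 4<&- 5<&-
cat merged
```

This prints x1y1z1, x2z2, x3: exhausted streams simply contribute an empty string, just as the undef reads do in the Perl above.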
#!/usr/bin/perl -l
use strict;
use warnings;
my @fh = map {
    open my $fh, '<', $_ or die "Can't open `$_': $!\n";
    $fh
} @ARGV;
no warnings 'uninitialized';
print while $_ = join '',
    map { chomp(my $line = <$_>); $line } @fh;
__END__
However:
- you should s/undefined/uninitialized/;
- it may not be fully reliable if empty lines are to be expected in the files.
Update: the second point was a thinko, as Zaxo pointed out.
Re: parallel reading
by blazar (Canon) on May 09, 2006 at 13:49 UTC
#!/usr/bin/perl -l
use strict;
use warnings;
my @fh = map {
    open my $fh, '<', $_ or die "Can't open `$_': $!\n";
    $fh
} @ARGV;
while (@fh) {
    last unless @fh = grep !eof $_, @fh;
    print map { chomp(my $line = <$_>); $line } @fh;
}
__END__
blazar:
Very nice (++)! I've never used map before, but that's an eye opener. It's so much better than my (admittedly terrible) hack, and clear to boot. That example is going on my "cheatsheet" of tips I keep pinned to my cube wall.
--roboticus
Re: parallel reading
by roboticus (Chancellor) on May 09, 2006 at 12:40 UTC
azaria
If you're on a *nix box, you could use the paste command, e.g.:
paste A B C
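Note that plain paste joins the lines with tabs; since the output this thread is after has no separator at all, the empty delimiter (spelled \0 in paste's -d list, which POSIX defines as "empty string", not a NUL byte) is closer. A quick demo, with file contents invented to match the sample data shown elsewhere in the thread:

```shell
# Invented sample files, matching the data in the thread
printf '111\n222\n333\n' > A
printf 'AAA\nBBB\nCCC\n' > B
printf 'aaa\nbbb\nccc\n' > C

paste A B C            # joins with tabs (the default)
paste -d '\0' A B C    # joins with nothing: 111AAAaaa, 222BBBbbb, 333CCCccc
```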
But since you asked on PerlMonks, you could try something like this (terrible) program:
#!/usr/bin/perl -w
use strict;
use warnings;
open(A, "<A") or die "Can't open A!";
open(B, "<B") or die "Can't open B!";
open(C, "<C") or die "Can't open C!";
my @a = <A>;
my @b = <B>;
my @c = <C>;
while (@a or @b or @c) {
    my $aa = shift @a || "";
    my $bb = shift @b || "";
    my $cc = shift @c || "";
    chomp $aa;
    chomp $bb;
    chomp $cc;
    print $aa, $bb, $cc, "\n";
}
--roboticus
First, thanks for your reply.
The example I gave is very short. The input files can vary in size and may be large, so I guess that might affect memory usage?
azaria
In that case I wouldn't slurp in the files all at once. My solution is one that doesn't, copes with a different number of lines per file, and with an arbitrary number of files passed on the cmd line. Shameless self ad terminated! ;-)
Re: parallel reading
by graff (Chancellor) on May 09, 2006 at 12:34 UTC
Try putting <code> and </code> around your data samples, so that we can see what the data really look like.
What you want is what the unix "paste" command does. Someone has written a perl version of "paste" already (google for "perl power tools").
(update: in case you have trouble finding it, here's the source for a perl implementation of paste: http://ppt.perl.org/commands/paste/paste.randy)
Re: parallel reading
by ashokpj (Hermit) on May 09, 2006 at 13:07 UTC
#!/usr/local/bin/perl
open (INFILE1, "/home/ashokpj/merge1.txt")
    || die ("Cannot open input file merge1\n");
open (INFILE2, "/home/ashokpj/merge2.txt")
    || die ("Cannot open input file merge2\n");
open (INFILE3, "/home/ashokpj/merge3.txt")
    || die ("Cannot open input file merge3\n");
chomp($line1 = <INFILE1>);
chomp($line2 = <INFILE2>);
chomp($line3 = <INFILE3>);
while ($line1 ne "" || $line2 ne "" || $line3 ne "") {
    print $line1 . $line2 . $line3 . "\n";
    if ($line1 ne "") {
        chomp($line1 = <INFILE1>);
    }
    if ($line2 ne "") {
        chomp($line2 = <INFILE2>);
    }
    if ($line3 ne "") {
        chomp($line3 = <INFILE3>);
    }
}
close(INFILE1);
close(INFILE2);
close(INFILE3);
Re: parallel reading
by McDarren (Abbot) on May 09, 2006 at 13:32 UTC
If we can make the assumption that each file has the same number of lines, then the following should work:
#!/usr/bin/perl -w
use strict;
my %files;
my @infiles = qw(fileA fileB fileC);
for (@infiles) {
    open IN, "<", $_ or die "Cannot open $_:$!\n";
    chomp(@{$files{$_}} = <IN>);
    close IN;
}
open OUT, ">", "fileD" or die "Cannot open fileD:$!\n";
for my $line (0 .. $#{$files{fileA}}) {
    for my $file (@infiles) {
        print OUT $files{$file}[$line];
    }
    print OUT "\n";
}
close OUT;
$ cat fileD
111AAAaaa
222BBBbbb
333CCCccc
Cheers,
Darren :)
Re: parallel reading
by wfsp (Abbot) on May 09, 2006 at 12:38 UTC
Hi azaria!
"Please advice how can i do it shortly?"
The very short answer is: write some code. :-)
I would guess you need to open 3 files for input and 1 for output. Assuming fairly small input files, read the input into arrays, loop over them and build your output. Save your output to a file.
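That plan (slurp each input whole, then loop over line indices and stitch the pieces together) can be sketched without Perl at all, e.g. in awk. File names and contents are invented for the demo:

```shell
# Invented sample inputs
printf '111\n222\n333\n' > A
printf 'AAA\nBBB\nCCC\n' > B
printf 'aaa\nbbb\nccc\n' > C

# Slurp every file into an array keyed by (file, line number),
# then loop to the longest file and print the concatenations.
# Missing lines simply contribute an empty string.
awk '
    FNR == 1 { files[++nf] = FILENAME }
    { line[FILENAME, FNR] = $0; if (FNR > max) max = FNR }
    END {
        for (i = 1; i <= max; i++) {
            out = ""
            for (f = 1; f <= nf; f++)
                out = out line[files[f], i]
            print out
        }
    }
' A B C > D
cat D
```

This prints 111AAAaaa, 222BBBbbb, 333CCCccc. (Completely empty input files never fire the FNR == 1 rule and are silently skipped; it is a sketch, not a polished tool.)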
Try it and let us know how you get on.
Re: parallel reading
by smokemachine (Hermit) on May 10, 2006 at 02:51 UTC
perl -e 'for(@ARGV){open FILE,$_;chomp($a[$.-1].=$_)while<FILE>;close FILE}$,=$/;open FILE,">out";print FILE @a' A B C
Re: parallel reading
by whyxys (Initiate) on May 11, 2006 at 01:52 UTC
JAPH (just another Perl approach), hehe:
perl -e 'map{chomp;$a[$i<3?$i:($i=0)].=$_;$i++}<>;print"@a\n";' filea fileb filec
This assumes each file has the same number of lines (3 here, for ease).