PerlMonks  

Loop through 2 files in parallel

by Anonymous Monk
on Nov 10, 2010 at 19:42 UTC ( [id://870665]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Okay, help me wrap my brain around this problem. I got 2 files. Here's one:
A
B
C
D
E
F
G
Here's the other one.
B
D
E
G
I want this output.
blank line goes here!!
B
blank line goes here!!
D
E
blank line goes here!!
G
So far I got this.
#!/usr/bin/perl
#Usage
use strict;
use warnings;

my $filename1 = $ARGV[0];
my $filename2 = $ARGV[1];

open FILE, $filename1 or die $!;
open FILE2, $filename2 or die $!;

while(<FILE>) {
    while(<FILE2>) {
        #Magic happens. Loop through 2 files in parallel, somehow making the
        #1st loop advance to the next line when it matches the text in the
        #second loop's line.
    }
}
How do I get it to do that kind of magic? I think I need some commands to give a name to the first and second loops. Also, I need a command to tell it to go to the next iteration of whichever loop. Or, you could tell me I am going about this all wrong and there is a way better way to do it. That would be great too.

Replies are listed 'Best First'.
Re: Loop through 2 files in parallel
by GrandFather (Saint) on Nov 11, 2010 at 03:17 UTC

    Assuming file 1 is always a superset of file 2 the following does the trick:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $str1 = <<STR;
    A
    B
    C
    D
    E
    F
    G
    STR

    my $str2 = <<STR;
    B
    D
    E
    G
    STR

    open my $in1, '<', \$str1;
    open my $in2, '<', \$str2;

    while (! eof $in1) {
        my $line2 = <$in2>;
        my $line1;

        print "blank line goes here!!\n"
            while defined ($line1 = <$in1>)
            && (! defined ($line2) || $line1 ne $line2);
        print $line2 if defined $line2;
    }

    Prints:

    blank line goes here!!
    B
    blank line goes here!!
    D
    E
    blank line goes here!!
    G

    True laziness is hard work
Re: Loop through 2 files in parallel
by Roy Johnson (Monsignor) on Nov 10, 2010 at 20:16 UTC
    You definitely don't want to nest your file-reading loops. Instead, try something like this (untested):
    while (my $w1 = <FILE> and my $w2 = <FILE2>) {
        if ($w1 lt $w2) {
            print "\n";
            $w1 = <FILE>;
        }
        elsif ($w1 gt $w2) {
            print "\n";
            $w2 = <FILE2>;
        }
        else {
            print $w1;
        }
    }
    # Leftover lines in either file?
    print "\n" while <FILE>;
    print "\n" while <FILE2>;

    Caution: Contents may have been coded under pressure.
      By george, I think this will work. You are truly a monktastic monk! Thanks!
        at least one error in previous code ... while condition is wrong.
        use strict;
        use warnings;

        my @FILES;
        open $FILES[0], '<', 'file1' or die;
        open $FILES[1], '<', 'file2' or die;

        my @w = map scalar <$_>, @FILES;
        my $empty = "blank line\n";
        my %map = (
            -1 => [ 0 ],
             1 => [ 1 ],
             0 => [ 0, 1 ],
        );

        while (defined $w[0] && defined $w[1]) {
            my $cmp = $w[0] cmp $w[1];
            print $cmp ? $empty : $w[0];
            @w[ @{$map{$cmp}} ] = map scalar <$_>, @FILES[ @{$map{$cmp}} ];
        }

        # Leftover lines in either file?
        print map $empty, map <$_>, @FILES;
Re: Loop through 2 files in parallel
by sherab (Scribe) on Nov 10, 2010 at 20:04 UTC
    It appears to me that you have two arrays and, where something is in common, you want those elements printed, with elements that occur in only one of them replaced by a blank line? Just trying to figure out exactly what you're trying to do.
      Sorry, I probably wasn't clear the first time. What I have is two very similar files, but one file is missing some lines. I want to make the files have the same number of lines, but I don't want to scroll through the two files in two different text editors, pushing enter where the lines don't match. Does that make sense? Basically, the first file has a complete list of things and the second one is missing some, but there's nothing there to show which ones are missing. I want to put in a blank line where there is something missing from the second list.
Re: Loop through 2 files in parallel
by rwitmer (Initiate) on Nov 10, 2010 at 20:10 UTC
    Okay, I'm logged in now. Posted this question anonymously by accident. I found some answers here:

    http://docstore.mik.ua/orelly/perl/prog3/ch04_04.htm
    4.4.4. Loop Control

    But I still don't understand how to use the label and next commands to solve my problem. I think it would go kind of like this:
    Start loop A, get a line.
    Start loop B, get a line.
    B doesn't match A so print a blank line.
    Go to the next iteration of loop A.
    Start loop B, get a line.
    Now A and B match, so print out B.
    Go to the next iteration of loop A.
    Start loop B, get a line.
    I think that's how the whole "next" idea works. But what I really want is not that. I don't want it to start loop B over every time. I want it to hold its place in loop B. Can I do that?
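
A file handle remembers its position between reads, so the second file does not need a loop of its own. A minimal sketch of that idea, assuming (as elsewhere in this thread) that every line of the second file also appears, in the same order, in the first. The sub name `pad_missing` and the in-memory file handles are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: reads two open handles and returns the padded lines.
# Assumes every line of $short_fh appears, in order, in $full_fh.
sub pad_missing {
    my ($full_fh, $short_fh) = @_;
    my @out;
    my $pending = <$short_fh>;          # next line of file 2 still to be matched

    while (my $line = <$full_fh>) {
        if (defined $pending && $line eq $pending) {
            push @out, $line;
            $pending = <$short_fh>;     # advance file 2 only after a match
        }
        else {
            push @out, "\n";            # file 2 has no such line at this point
        }
    }
    return @out;
}

# In-memory handles stand in for the two real files.
my $full  = join '', map { "$_\n" } qw(A B C D E F G);
my $short = join '', map { "$_\n" } qw(B D E G);
open my $fh1, '<', \$full  or die $!;
open my $fh2, '<', \$short or die $!;
print pad_missing($fh1, $fh2);
```

Only one line from each file is held at a time, so nothing needs to be rewound or loaded into an array. Lines are compared with their line endings intact, so a final line without a trailing newline would not match one that has it.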
Re: Loop through 2 files in parallel
by AndyZaft (Hermit) on Nov 10, 2010 at 20:04 UTC
    while ($line_f1 = <FILE>) {
        while ($line_f2 = <FILE2>) {
            # I'm sure the rest you can do it
        }
        seek(FILE2,0,0);
    }
    could be one solution. If it was me, however, I would probably read the smaller file into an array or another structure and just go through the other one, instead of re-reading the second file once for every line in the first.
      The Perl Cookbook might be helpful here: "Finding elements in one array but not another". I'm with AndyZaft, putting each of these into an array and then doing your logic is the way to go.
        This is a very good idea (the hash keys) but it won't work in my situation because the elements of the list are not unique. They can occur multiple times in the list, so how far along in the list you are makes a difference. I think you're right about the array. If I put each file into an array I can keep track of where I left off and move around. I was just trying to figure out how to do it without using an array because the lists are pretty long, 20K lines each. I guess that isn't much for Perl, but it would be better for keeping what's loaded into memory down if it wasn't all put into an array at once.
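
The "keep track of where I left off" idea could look roughly like the sketch below. It assumes the shorter list is an in-order subset of the full one. The sub name `align_lists` and the sample data are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs and returns the short list
# padded with empty strings so that it lines up with the full list.
# Assumes @$short is an in-order subsequence of @$full.
sub align_lists {
    my ($full, $short) = @_;
    my $j = 0;                          # where we left off in @$short
    my @aligned;

    for my $item (@$full) {
        if ($j < @$short && $item eq $short->[$j]) {
            push @aligned, $item;
            $j++;                       # consume the matched element
        }
        else {
            push @aligned, '';          # placeholder for a missing entry
        }
    }
    return @aligned;
}

# Repeated items are fine: matching depends on position, not uniqueness.
my @full  = qw(A B A C A);
my @short = qw(B A A);
print "$_\n" for align_lists(\@full, \@short);
```

Because the index `$j` only ever moves forward, each list is walked once, which is why duplicates do not cause trouble the way a hash lookup would.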
Re: Loop through 2 files in parallel
by aquarium (Curate) on Nov 11, 2010 at 00:04 UTC
    if this was not just an exercise in perl, i'd do it using some combination of unix/linux utilities that made sense, e.g. diff, join, etc.
    the hardest line to type correctly is: stty erase ^H
Re: Loop through 2 files in parallel
by aquarium (Curate) on Nov 11, 2010 at 04:04 UTC
    assuming you want any missing letter in the well defined sequence A-Z in either file to be printed as a blank line. untested
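    my @arr1 = <FILE>;
    my @arr2 = <FILE2>;
    chomp(@arr1, @arr2);    # strip newlines so lines compare equal to bare letters

    for my $letter ('A' .. 'Z') {
        # Print the letter if either file contains it; otherwise leave the line blank.
        if (grep { $_ eq $letter } @arr1, @arr2) {
            print $letter;
        }
        print "\n";
    }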
    the hardest line to type correctly is: stty erase ^H
