PerlMonks  

Re^4: Simple comparison of 2 files

by pryrt (Abbot)
on Jul 27, 2016 at 19:30 UTC ( [id://1168674] )


in reply to Re^3: Simple comparison of 2 files
in thread Simple comparison of 2 files

and the next issue is that you're only looping through the second file once. To compare every line of file1 against every line of file2, you need to read the second file once for each line of file1 (i.e., a nested loop). Alternatively, read all of file2 into an appropriate data structure and compare each line of file1 against each entry in that structure (or vice versa, or read both files into data structures and then just compare the data).
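As one possible reading of "an appropriate data structure", here is a minimal sketch that indexes file2 by letter in a hash, so each line of file1 needs a single lookup instead of a scan. The helper name `index_by_letter` is hypothetical, and the sample rows are inferred from the output shown in this thread rather than taken from the real input files. Unlike the nested loop, this only reports which letters match; it does not print every non-matching pair.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build a hash that maps each
# letter to the list of numbers seen with it in "letter number" lines.
sub index_by_letter {
    my (@lines) = @_;
    my %numbers_for;
    for my $line (@lines) {
        my ($letter, $number) = split ' ', $line;
        next unless defined $number;    # skip blank or malformed lines
        push @{ $numbers_for{$letter} }, $number;
    }
    return \%numbers_for;
}

# Sample rows inferred from the output in this thread (assumption).
my @file1 = ("A 1_1", "A 1_2", "B 1_3", "C 1_4");
my @file2 = ("A 2_1", "B 2_2");

my $index = index_by_letter(@file2);

for my $line (@file1) {
    my ($letter, $number) = split ' ', $line;
    my $hits = $index->{$letter} || [];    # no autovivification on a miss
    if (@$hits) {
        print "$letter from FILE1 with number $number matches FILE2 number(s): @$hits\n";
    }
    else {
        print "$letter from FILE1 with number $number has no match in FILE2\n";
    }
}
```

The hash costs memory proportional to file2, but the work per file1 line no longer grows with the length of file2.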

update1

    #!perl
    open (FILE1, $ARGV[0]);
    while ($_ = <FILE1>) {
        chomp;
        @FILE1 = split;
        ($FILE1letter, $FILE1number) = @FILE1;
        open (FILE2, $ARGV[1]);
        while( $two = <FILE2> ) {
            @FILE2 = split(' ',$two);
            ($FILE2letter, $FILE2number) = @FILE2;
            # print "$FILE1letter from FILE1 with number $FILE1number and $FILE2letter from FILE2 with $FILE2number match\n";
            # prints the same as below
            if ($FILE1letter eq $FILE2letter) {
                print "$FILE1letter from FILE1 with number $FILE1number and $FILE2letter from FILE2 with number $FILE2number match\n";
            } else {
                print "$FILE1letter from FILE1 with number $FILE1number and $FILE2letter from FILE2 with number $FILE2number DO NOT match\n";
            }
        }
        close (FILE2);
    }
    __END__
    C:\Temp>perl ab.pl a b
    A from FILE1 with number 1_1 and A from FILE2 with number 2_1 match
    A from FILE1 with number 1_1 and B from FILE2 with number 2_2 DO NOT match
    A from FILE1 with number 1_2 and A from FILE2 with number 2_1 match
    A from FILE1 with number 1_2 and B from FILE2 with number 2_2 DO NOT match
    B from FILE1 with number 1_3 and A from FILE2 with number 2_1 DO NOT match
    B from FILE1 with number 1_3 and B from FILE2 with number 2_2 match
    C from FILE1 with number 1_4 and A from FILE2 with number 2_1 DO NOT match
    C from FILE1 with number 1_4 and B from FILE2 with number 2_2 DO NOT match

Re^5: Simple comparison of 2 files
by Q.and (Novice) on Jul 27, 2016 at 19:38 UTC
    Excellent, this is exactly the issue I was dealing with. Thank you!

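      alternatively, to avoid re-opening and re-reading file2 from disk once for every line of file1 (len(file1) passes over file2), read each file into memory once and loop over the arrays instead: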

      use autodie;
      use warnings;
      use strict;
      my (@data1, @data2) = ();
      my ($fh, $l, $n);

      open $fh, "<", $ARGV[0];
      while(<$fh>) {
          ($l, $n) = split;
          push @data1, [ $l, $n ];
      }
      close($fh);

      open $fh, "<", $ARGV[1];
      while(<$fh>) {
          ($l, $n) = split;
          push @data2, [ $l, $n ];
      }
      close($fh);

      foreach my $row1 ( @data1 ) {
          foreach my $row2 ( @data2 ) {
              my ($l1, $n1, $l2, $n2) = (@$row1, @$row2);
              my $match = $l1 eq $l2;
              print "$l1 from FILE1 with number $n1 and $l2 from FILE2 with number $n2" . ($match ? '' : " DO NOT") . " match\n";
          }
      }
      __END__
      A from FILE1 with number 1_1 and A from FILE2 with number 2_1 match
      A from FILE1 with number 1_1 and B from FILE2 with number 2_2 DO NOT match
      A from FILE1 with number 1_2 and A from FILE2 with number 2_1 match
      A from FILE1 with number 1_2 and B from FILE2 with number 2_2 DO NOT match
      B from FILE1 with number 1_3 and A from FILE2 with number 2_1 DO NOT match
      B from FILE1 with number 1_3 and B from FILE2 with number 2_2 match
      C from FILE1 with number 1_4 and A from FILE2 with number 2_1 DO NOT match
      C from FILE1 with number 1_4 and B from FILE2 with number 2_2 DO NOT match

      this will save a lot of time if the files are significantly larger than 4 and 2 lines, respectively, though it will end up using more memory...

        Reading file1 into memory doesn't save anything. Reading file2 into memory does. The "expensive" operation is the line-by-line text read of the input file. Saving the split from file1 is an idea, but it isn't necessary, since each line from file1 need only be read and dealt with once, as per my code.
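A minimal sketch of that point, assuming the same whitespace-separated "letter number" layout used in this thread: file2 is read into memory once, while file1 is streamed and each of its lines is handled exactly once. The sub names `load_rows` and `compare_files` are hypothetical and are not the code the post above refers to.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: read every "letter number" row of a file into memory.
sub load_rows {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @rows;
    while (my $line = <$fh>) {
        my ($letter, $number) = split ' ', $line;
        next unless defined $number;    # skip blank or malformed lines
        push @rows, [ $letter, $number ];
    }
    close $fh;
    return \@rows;
}

# Stream file1 line by line and compare each line against the cached
# rows of file2. Returns the report lines instead of printing them.
sub compare_files {
    my ($path1, $path2) = @_;
    my $rows2 = load_rows($path2);      # file2 is read from disk only once
    my @report;
    open my $fh, '<', $path1 or die "Cannot open $path1: $!";
    while (my $line = <$fh>) {
        my ($l1, $n1) = split ' ', $line;
        next unless defined $n1;
        for my $row2 (@$rows2) {
            my ($l2, $n2) = @$row2;
            my $verdict = $l1 eq $l2 ? 'match' : 'DO NOT match';
            push @report,
                "$l1 from FILE1 with number $n1 and $l2 from FILE2 with number $n2 $verdict\n";
        }
    }
    close $fh;
    return @report;
}

# Usage: perl script.pl file1 file2
print compare_files(@ARGV) if @ARGV == 2;
```

Only file2 has to fit in memory here; file1 can be arbitrarily long, because nothing from it is kept after its line has been compared.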