http://qs321.pair.com?node_id=575257


in reply to Re: comparing two files for duplicate entries
in thread comparing two files for duplicate entries

based on my original code! (:
perl -ne '/(\S*)\s+(\S*)/;(!$h{$1})?$h{$1}=$2:print "$1 $h{$1} $2\n";' file1 file2


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"one who asks a question is a fool for five minutes; one who does not ask a question remains a fool forever."

mk at perl dot org dot br

Re^3: comparing two files for duplicate entries
by davido (Cardinal) on Sep 28, 2006 at 03:49 UTC

    That one formats nicely. It does have a quirk: when a key repeats more than once, it always prints the "value" from the first find next to the current find. That's a mouthful, so let me demonstrate with a contrived data set:

    file1.....
    test1 abc
    test2 def
    test3 ghi
    test4 jkl

    file2.....
    test1 ghi
    test3 jkl
    test3 mno

    And the output.....

    test1 abc ghi
    test3 ghi jkl
    test3 ghi mno

    As you can see, test3's "ghi" (the first sequence found) gets repeated for each 'test3' found. Not that there's anything wrong with that. ;)
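
    If you would rather see each repeat paired with the value found just before it, one option is to overwrite the stored value on every line. The following is only a sketch: the sub name pair_with_previous is made up for illustration, and it checks exists rather than truthiness, so it is not character-for-character the one-liner's logic.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): takes "key value"
# lines and reports each repeated key next to the value seen just before it.
sub pair_with_previous {
    my @lines = @_;
    my (%last, @out);
    for my $line (@lines) {
        my ($key, $value) = split ' ', $line;
        next unless defined $value;          # skip blank or one-field lines
        push @out, "$key $last{$key} $value" if exists $last{$key};
        $last{$key} = $value;                # always remember the newest value
    }
    return @out;
}

# The contrived data set from above, file1 followed by file2.
my @data = (
    'test1 abc', 'test2 def', 'test3 ghi', 'test4 jkl',
    'test1 ghi', 'test3 jkl', 'test3 mno',
);
print "$_\n" for pair_with_previous(@data);
# prints:
# test1 abc ghi
# test3 ghi jkl
# test3 jkl mno
```

    The last line now reads "test3 jkl mno" instead of "test3 ghi mno".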

    If you use the -a switch, you will shave off a few more keystrokes from your solution though, and that's got to be worth something!

    perl -ane '($a,$b)=@F;!$h{$a}?$h{$a}=$b:print"$a $h{$a} $b\n"' file1 file2
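
    For anyone unfamiliar with the switches: -n wraps the code in a while (<>) loop, and -a splits each input line on whitespace into @F. Written out longhand, the one-liner is roughly the sketch below. The sub name repeats_in_files is invented for illustration, and unlike the one-liner the sketch skips lines with fewer than two fields.

```perl
use strict;
use warnings;

# Longhand approximation of: perl -ane '...' file1 file2
# (repeats_in_files is a hypothetical name used only in this sketch)
sub repeats_in_files {
    my @files = @_;
    local @ARGV = @files;               # <> reads the files named in @ARGV, in order
    my (%h, @out);
    while (my $line = <>) {             # the loop that -n supplies
        my @F = split ' ', $line;       # the split that -a supplies
        my ($key, $value) = @F;
        next unless defined $value;     # assumption: ignore incomplete lines
        if (!$h{$key}) {
            $h{$key} = $value;          # first sighting: remember its value
        }
        else {
            push @out, "$key $h{$key} $value";   # repeat: first value, then current
        }
    }
    return @out;
}

# Only read files when some were named on the command line.
if (@ARGV) {
    print "$_\n" for repeats_in_files(@ARGV);
}
```

    Saved as, say, dups.pl (a hypothetical file name), it would be run as: perl dups.pl file1 file2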

    I do like your solution since it preserves order and formats nicely. Good job.


    Dave