Re^2: comparing two files for duplicate entries

in reply to Re: comparing two files for duplicate entries
in thread comparing two files for duplicate entries

based on my original code! (:
perl -ne '/(\S*)\s+(\S*)/;(!$h{$1})?$h{$1}=$2:print "$1 $h{$1} $2\n";' file1 file2

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"one who asks a question is a fool for five minutes; one who does not ask a question remains a fool forever."

mk at perl dot org dot br


Replies are listed 'Best First'.

Re^3: comparing two files for duplicate entries
by davido (Cardinal) on Sep 28, 2006 at 03:49 UTC

That one formats nicely. It does have a quirk: when a key repeats more than once, it always prints the "value" of the first find next to the current find. That's a mouthful, so let me demonstrate with a contrived data set:

file1.....

test1 abc
test2 def
test3 ghi
test4 jkl

file2.....
test1 ghi
test3 jkl
test3 mno

And the output.....

test1 abc ghi
test3 ghi jkl
test3 ghi mno

As you can see, test3's "ghi" (the first sequence found) gets repeated for each 'test3' found. Not that there's anything wrong with that. ;)

If you use the -a switch, you will shave off a few more keystrokes from your solution though, and that's got to be worth something!

perl -ane '($a,$b)=@F;!$h{$a}?$h{$a}=$b:print"$a $h{$a} $b\n"' file1 file2

I do like your solution since it preserves order and formats nicely. Good job.

Dave


In Section Seekers of Perl Wisdom