Difference of array

sandy1028 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, if I take the diff of two files, one having 13423 lines and the other 12354, the difference is around 1100 (approx.). But if I copy the contents of the files into arrays and
use the code below, the difference is only about 200. How can I get the same count as
diff file1 file2?

@union = @intersection = @difference = ();
%count = ();
foreach $element (@array1, @array2) { $count{$element}++ }
foreach $element (keys %count) {
    push @union, $element;
    push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
}

Re: Difference of array by CountZero (Bishop) on May 15, 2009 at 09:09 UTC
If you do not want to re-invent the wheel, look at Array::Diff and Array::Compare.
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Re: Difference of array by moritz (Cardinal) on May 15, 2009 at 08:36 UTC
This is the code from perlfaq4, and it also says "It assumes that each element is unique in a given array". Is that the case for your arrays? If not, it would explain the difference.
Update: See almut's reply below, this code is nonsense. ~~(Update:) I'd just store one array in the hash, something along these lines: (untested)~~ ~~`my (@intersection, @union); my %count; @count{@array1} = undef; for (@array2) { if (exists $count{$_}) { push @union, $_; } else { push @intersection, $_; } }`~~
(Second update): Any idea why the answer in the FAQs makes such an IMHO needless assumption? The code without that assumption isn't much longer, and although I haven't benchmarked it, I don't think it's much slower either (I even think it uses less memory).
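To illustrate the point about the uniqueness assumption, here is a minimal sketch of the three set operations that tolerates duplicates inside either array. The helper name `set_ops` and the sample data are assumptions made for illustration; this is not the perlfaq4 code. It records membership per array instead of using one shared counter, so a value that appears twice in the same array is not mistaken for a member of the intersection.

```perl
use strict;
use warnings;

# Hypothetical helper (not from perlfaq4): set operations that tolerate
# duplicates inside either input array.
sub set_ops {
    my ($first, $second) = @_;
    my %in_first  = map { $_ => 1 } @$first;    # distinct values of array 1
    my %in_second = map { $_ => 1 } @$second;   # distinct values of array 2

    my (@union, @intersection, @difference);
    my %seen;
    for my $element (@$first, @$second) {
        next if $seen{$element}++;              # handle each distinct value once
        push @union, $element;
        if ($in_first{$element} && $in_second{$element}) {
            push @intersection, $element;       # present in both arrays
        }
        else {
            push @difference, $element;         # present in exactly one array
        }
    }
    return (\@union, \@intersection, \@difference);
}

my ($union, $intersection, $difference) =
    set_ops([qw(foo foo bar baz)], [qw(bar grmpf asdf)]);

print "union:        @$union\n";          # foo bar baz grmpf asdf
print "intersection: @$intersection\n";   # bar
print "difference:   @$difference\n";     # foo baz grmpf asdf
```

With the same input, the single-counter version would put `foo` into the intersection, because its count reaches 2 even though it only occurs in the first array.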
Re^2: Difference of array by almut (Canon) on May 15, 2009 at 09:09 UTC
"Any idea why the answer in the FAQs makes such an IMHO needless assumption?"

I think your code doesn't compute the difference, and the union also isn't what you'd normally define as union (even if you swap `@intersection` and `@union`)...

    my @array1 = qw(foo foo bar baz);
    my @array2 = qw(bar grmpf asdf);

    my (@intersection, @union);
    my %count;
    @count{@array1} = undef;
    for (@array2) {
        if (exists $count{$_}) {
            push @union, $_;
        } else {
            push @intersection, $_;
        }
    }

    use Data::Dumper;
    print Dumper \@intersection, \@union;

    __END__
    $VAR1 = [ 'grmpf', 'asdf' ];
    $VAR2 = [ 'bar' ];
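For comparison, here is a rough sketch of a difference that counts duplicates, run on the same two sample arrays. The helper name `multiset_minus` is hypothetical. It cancels each element of one array against one matching element of the other and returns what is left over. It ignores element order, whereas `diff` compares lines in sequence, so the totals may still not match `diff file1 file2` exactly.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the elements of @$first that remain after
# cancelling each one against a matching element of @$second.
sub multiset_minus {
    my ($first, $second) = @_;
    my %available;
    $available{$_}++ for @$second;      # how often each value occurs in @$second

    my @left_over;
    for my $element (@$first) {
        if ($available{$element}) {
            $available{$element}--;     # cancel one matching occurrence
        }
        else {
            push @left_over, $element;  # nothing left to cancel against
        }
    }
    return @left_over;
}

my @array1 = qw(foo foo bar baz);
my @array2 = qw(bar grmpf asdf);

my @only_in_1 = multiset_minus(\@array1, \@array2);
my @only_in_2 = multiset_minus(\@array2, \@array1);

print "only in array1: @only_in_1\n";   # foo foo baz
print "only in array2: @only_in_2\n";   # grmpf asdf
```

Adding the sizes of the two result lists gives an order-insensitive count of differing elements, which is usually closer to what a line-based comparison reports than a count of distinct keys.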
Re^3: Difference of array by moritz (Cardinal) on May 15, 2009 at 09:18 UTC
Yes, you're totally right. If one allows duplicates, the union is just `@union = (@array1, @array2)`.
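One Perl detail matters when concatenating arrays this way: the right-hand side needs parentheses, because assignment binds more tightly than the comma operator. A small sketch with made-up sample data:

```perl
use strict;
use warnings;

my @array1 = qw(foo foo bar baz);
my @array2 = qw(bar grmpf asdf);

# Parentheses make the right-hand side a single list. Without them the
# assignment would copy only @array1 and evaluate @array2 in void context.
my @union = (@array1, @array2);

print scalar(@union), " elements: @union\n";   # 7 elements: foo foo bar baz bar grmpf asdf
```

Duplicates are kept on purpose here; this is the "union with duplicates" reading, not a set union.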
Re: Difference of array by wol (Hermit) on May 15, 2009 at 10:11 UTC
If you're manipulating sets then take a look at Set::Scalar.
-- use JAPH; print JAPH::asString();

