http://qs321.pair.com?node_id=489796

monkfan has asked for the wisdom of the Perl Monks concerning the following question: (arrays)

Given this:
my @AoA = (
    ['a','b','c'],
    ['a','b','c'],
    ['a','b','d'],
    ['a','b','d'],
);
I wish to return simply this:
my @uAoA = (
    ['a','b','c'],
    ['a','b','d'],
);
Many answers can be found in the original posting in SoPW. One of them, by TedPride, seems to me the most compact of them all:
use strict;
use warnings;
use Data::Dumper;

my @AoA = (
    ['a','b','c'],
    ['a','b','c'],
    ['a','b','d'],
    ['a','b','d'],
);

my (%h, @uAoA);
for (@AoA) {
    push @uAoA, $_ if !$h{join $;, @$_}++;
}
print Dumper \@uAoA;

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: How do I find unique Array in Array of Arrays?
by bart (Canon) on Dec 18, 2007 at 13:23 UTC
    There's likely a cheaper way to stringify an array of strings (no nested arrays) than using Data::Dumper: if you first quotemeta every string and then join with any \W character (except '\'), every distinct combination of strings in the arrays yields a distinct key, because the join character will not have a preceding backslash, while every \W character in the data will.

    In addition, I'm using fewer extra data structures than in the other proposed solutions.

    my %seen;
    my @aoa = grep { not $seen{join " ", map quotemeta, @$_}++ } @AoA;
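    To see why the quotemeta step matters, here is a small sketch comparing a plain join on a space with the quotemeta-then-join key. The sample rows and the helper names plain_key and safe_key are made up for this illustration; they are not part of the post above.

```perl
use strict;
use warnings;

# Two different rows that look identical once joined on a space.
my @rows = ( [ 'a b', 'c' ], [ 'a', 'b c' ] );

# Hypothetical helpers, named only for this illustration.
sub plain_key { join ' ', @{ $_[0] } }
sub safe_key  { join ' ', map { quotemeta($_) } @{ $_[0] } }

my ( %plain, %safe );
my @by_plain = grep { !$plain{ plain_key($_) }++ } @rows;
my @by_safe  = grep { !$safe{ safe_key($_) }++ } @rows;

# The plain keys are both "a b c"; the escaped keys are "a\ b c" and "a b\ c".
printf "plain join keeps %d row(s)\n",     scalar @by_plain;
printf "quotemeta join keeps %d row(s)\n", scalar @by_safe;
```

    The plain join wrongly treats the two rows as duplicates and keeps one; the quotemeta version keeps both, because only spaces that came from the data carry a backslash.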
Re: How do I find unique Array in Array of Array?
by jdporter (Paladin) on May 11, 2006 at 16:34 UTC

    In order to detect duplicates, one needs a way of testing equality. For non-scalar data structures, this can be tricky, or at least application dependent.

    As a generic solution, one can simply stringify each data structure and compare using string equality (eq). This technique probably breaks badly when any of the contents are objects or functions or other exotic beasts. For strings and numbers, it works pretty well.

    Here, I use Data::Dumper for stringification:

    my @a = (
        ['a','b','c'],
        ['a','b','c'],
        ['a','b','d'],
        ['a','b','d'],
    );

    my @b = do {
        use Data::Dumper;
        my %seen;
        map  { $_->[0] }
        grep { !$seen{$_->[1]}++ }
        map  { [ $_, Dumper($_) ] }
        @a
    };
      Since this stores the Data::Dumper output for each array in %seen as a key, wouldn't it be vastly inefficient in terms of memory use? It might be better to md5 your Dumper output before using it as a key:
      use strict;
      use warnings;
      use Data::Dumper;
      use Digest::MD5 qw/md5/;

      my @a = (
          ['a','b','c'],
          ['a','b','c'],
          ['a','b','d'],
          ['a','b','d'],
      );

      my %seen;
      my @b = grep { !$seen{md5 Dumper($_)}++ } @a;
      print Dumper(\@b);

        Good point; but as with anything, there's a time/space tradeoff, and it's the engineer's call.

        I would say that for something like your sample data, the time it takes to calculate the MD5 would not be worth it, especially given that the memory savings would be negligible.

        In really extreme cases, you'd probably want a function that could hash a complex data structure directly, rather than a stringification of it.

        We're building the house of the future together.
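        As a rough sketch of what hashing a structure directly might look like, the code below walks nested array and hash refs and feeds each piece into a Digest::MD5 object, so no full stringification is ever held in memory. The helper names add_to_digest and structure_digest are hypothetical, and the sketch assumes the leaves are plain byte strings or numbers (Digest::MD5 rejects wide characters, and other reference types are refused).

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: feed scalars, array refs and hash refs into a
# digest object piece by piece.
sub add_to_digest {
    my ( $ctx, $data ) = @_;
    my $type = ref $data;
    if ( $type eq 'ARRAY' ) {
        $ctx->add( 'A' . scalar(@$data) . ':' );
        add_to_digest( $ctx, $_ ) for @$data;
    }
    elsif ( $type eq 'HASH' ) {
        $ctx->add( 'H' . scalar( keys %$data ) . ':' );
        for my $key ( sort keys %$data ) {    # sorted for a stable order
            add_to_digest( $ctx, $key );
            add_to_digest( $ctx, $data->{$key} );
        }
    }
    elsif ( $type eq '' ) {
        # The length prefix keeps ('ab','c') distinct from ('a','bc').
        my $value = defined $data ? "$data" : '';
        $ctx->add( ( defined $data ? 'S' : 'U' ) . length($value) . ':' . $value );
    }
    else {
        die "unsupported reference type: $type";
    }
    return $ctx;
}

# Hypothetical wrapper returning the 16-byte binary digest.
sub structure_digest {
    return add_to_digest( Digest::MD5->new, $_[0] )->digest;
}

my @a = ( ['a','b','c'], ['a','b','c'], ['a','b','d'], ['a','b','d'] );
my %seen;
my @b = grep { !$seen{ structure_digest($_) }++ } @a;
print scalar(@b), " unique rows\n";
```

        Whether this beats Dumper plus md5 in practice would need measuring; for small rows like the sample data it is almost certainly not worth the extra code.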
Re: How do I find unique Array in Array of Array?
by mohan123 (Novice) on Jun 13, 2006 at 13:32 UTC
    use strict;
    use warnings;
    use Data::Dumper;

    my @AoA = (['a','b','c'], ['a','b','c'], ['a','b','d'], ['a','b','d']);

    my %temp = ();
    my @aoa = map  { [split $;, $_] }
              grep { ++$temp{$_} < 2 }
              map  { join $;, @$_ }
              @AoA;
    print Dumper \@aoa;

    Janitored by Corion: Added formatting, code tags, as per Writeup Formatting Tips

Re: How do I find unique Array in Array of Arrays?
by rajesh.bodagala (Initiate) on Dec 18, 2007 at 12:22 UTC
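    use strict;
    use warnings;
    use Data::Dumper;

    my @AoA = (['a','b','c'], ['a','b','c'], ['a','b','d'], ['a','b','d']);

    my ($h, $uAoA);
    # Record each row under a key built by joining its elements with '-'.
    $h->{join '-', @$_} = 1 for @AoA;
    # Rebuild one row per distinct key, splitting on the same separator.
    push @{$uAoA}, [split /-/, $_] for sort keys %{$h};
    print Dumper $uAoA;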
      That will fail if some of the array elements contain -. That's why you normally want out of band signaling.
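      A small sketch of that failure, with sample data invented for the purpose: two different rows produce the same key when joined with '-', while joining with $; (by default the "\034" character) keeps them apart, on the assumption that this byte never occurs in the data.

```perl
use strict;
use warnings;

# Sample rows chosen so that a '-' separator produces identical keys.
my @rows = ( [ 'a-b', 'c' ], [ 'a', 'b-c' ], [ 'a', 'b-c' ] );

my ( %dash, %sep );
my @by_dash = grep { !$dash{ join '-', @$_ }++ } @rows;
my @by_sep  = grep { !$sep{ join $;, @$_ }++ } @rows;

# With '-' every row becomes "a-b-c", so distinct rows are lost.
printf "joined with '-': %d row(s) kept\n", scalar @by_dash;
# With \$; only the true duplicate is dropped.
printf "joined with \$; : %d row(s) kept\n", scalar @by_sep;
```

      The '-' version keeps a single row; the $; version keeps two, which is the correct answer here. The same collision would reappear with $; if an element ever contained "\034".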
Re: How do I find unique Array in Array of Array?
by perladdict (Chaplain) on May 11, 2006 at 06:42 UTC

    Hi, monk,

    Try the following code. I can't claim any credit for this.

    #!/usr/bin/perl -w
    @AoA = (["1","15"], ["2","5"], ["3","4"], ["3","5"],
            ["3","8"], ["3","5"], ["4","6"], ["4","5"]);

    # print out original array
    foreach $row (@AoA) {
        print "[";
        foreach $v (@{$row}) {
            print $v, ",";
        }
        print "],";
    }
    print "\n";

    # now remove duplicates - see web page for explanation
    %temp = ();
    @list = grep ++$temp{join(",",@{$_})} < 2, @AoA;

    # now print again to see amended list
    foreach $row (@list) {
        print "[";
        foreach $v (@{$row}) {
            print $v, ",";
        }
        print "],";
    }
    print "\n";

    Originally posted as a Categorized Answer.