http://qs321.pair.com?node_id=506517

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Greetings all,

Is there a quick and dirty way to remove duplicate entries from an array? Let's say I have an array @XYZ which contains

Mary Mary Mary Mary Joe Joe Joe

How can I eliminate duplicate entries?

Thanks,
NooBie

Replies are listed 'Best First'.
Re: removing duplicate entries from an array
by Corion (Patriarch) on Nov 07, 2005 at 20:46 UTC
      Thanks
Re: removing duplicate entries from an array
by BUU (Prior) on Nov 07, 2005 at 20:47 UTC
    If you pull out your handy dandy perldoc and type 'perldoc -q duplicate', which searches for the word 'duplicate' in the FAQ section of the perldoc, you come upon this handy entry:
    ded430-deb-175-30:/home/buu/torrent# perldoc -q duplicat
    Found in /usr/share/perl/5.8/pod/perlfaq4.pod
      How can I remove duplicate elements from a list or array?

      There are several possible ways, depending on whether the array is
      ordered and whether you wish to preserve the ordering.

      a) If @in is sorted, and you want @out to be sorted:
         (this assumes all true values in the array)

             $prev = "not equal to $in[0]";
             @out = grep($_ ne $prev && ($prev = $_, 1), @in);

         This is nice in that it doesn't use much extra memory, simulating
         uniq(1)'s behavior of removing only adjacent duplicates. The ", 1"
         guarantees that the expression is true (so that grep picks it up)
         even if the $_ is 0, "", or undef.

      b) If you don't know whether @in is sorted:

             undef %saw;
             @out = grep(!$saw{$_}++, @in);

      c) Like (b), but @in contains only small integers:

             @out = grep(!$saw[$_]++, @in);

      d) A way to do (b) without any loops or greps:

             undef %saw;
             @saw{@in} = ();
             @out = sort keys %saw;  # remove sort if undesired

      e) Like (d), but @in contains only small positive integers:

             undef @ary;
             @ary[@in] = @in;
             @out = grep {defined} @ary;

      But perhaps you should have been using a hash all along, eh?
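For readers who want to see FAQ variant (b) applied to the data from the question, here is a minimal sketch. It assumes the asker's @XYZ holds plain strings; the names %seen and @unique are illustrative and not taken from the FAQ text.

```perl
use strict;
use warnings;

# Sample data from the original question.
my @XYZ = qw(Mary Mary Mary Mary Joe Joe Joe);

# %seen counts how often each value has been met so far.
# !$seen{$_}++ is true only the first time a value appears,
# so grep keeps first occurrences in their original order.
my %seen;
my @unique = grep { !$seen{$_}++ } @XYZ;

print join(", ", @unique), "\n";    # prints "Mary, Joe"
```

Unlike variant (d), this keeps the elements in the order they first appeared, which is usually what people expect when the input is not sorted.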
Re: removing duplicate entries from an array
by Roy Johnson (Monsignor) on Nov 07, 2005 at 20:48 UTC
Re: removing duplicate entries from an array
by kirbyk (Friar) on Nov 07, 2005 at 20:51 UTC
    Quick and dirty? How about:
    my %seen;
    my @ABC;
    for my $element (@XYZ) {
        next if $seen{$element}++;
        push @ABC, $element;
    }
    @XYZ = @ABC;
    Of course, creating a second array just as a placeholder is wasteful (even though that code is very easy to understand). If the array could be large, you probably want to delete the entries in place, using splice or array slices.
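One possible reading of the in-place suggestion, sketched with splice. This is an illustration only, not code from the post, and the sample data is made up. Note that every splice shifts the remaining elements down, so on long arrays with many duplicates this may well be slower than copying; the thread does not measure it.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen so duplicates are not adjacent.
my @XYZ = qw(Mary Joe Mary Ann Joe Mary);

my %seen;
my $i = 0;
while ($i < @XYZ) {
    if ($seen{ $XYZ[$i] }++) {
        # Already seen: remove the element. The next element slides
        # into index $i, so the index must not advance here.
        splice @XYZ, $i, 1;
    }
    else {
        $i++;    # first occurrence: keep it and move on
    }
}

print "@XYZ\n";    # prints "Mary Joe Ann"
```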

    -- Kirby, WhitePages.com

Re: removing duplicate entries from an array
by GrandFather (Saint) on Nov 07, 2005 at 20:51 UTC

    Here is one way:

    use strict;
    use warnings;

    my @names = qw(Mary Mary Mary Mary Joe Joe Joe);
    my %unique;

    @unique{@names} = undef;
    @names = keys %unique;
    print join ", ", @names;

    Prints:

    Joe, Mary

    Update s/undef/()/ per Roy Johnson's reply


    Perl is Huffman encoded by design.
      Just a style point: assigning a scalar to a hash slice is quirky. Save a few keystrokes and have dimensional consistency by doing @unique{@names} = () instead.
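To make the style point concrete, here is a small sketch of what a hash slice assignment does with a scalar versus an empty list on the right-hand side. The hash names are hypothetical. With undef the end result happens to be the same either way, which is why the original code works; the difference only shows with a defined value.

```perl
use strict;
use warnings;

my @names = qw(Mary Joe);

my (%with_scalar, %with_list);

# A single value on the right: only the first key receives it,
# the remaining keys are created with undef.
@with_scalar{@names} = 1;

# An empty list on the right: every key is created with undef.
@with_list{@names} = ();

for my $name (@names) {
    my $value = defined $with_scalar{$name} ? $with_scalar{$name} : 'undef';
    print "$name => $value\n";
}
# prints:
# Mary => 1
# Joe => undef
```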

      Caution: Contents may have been coded under pressure.
      OP asked for quick & dirty :)
      print join ", ", keys %{{map {$_=>undef} @XYZ}};
        Dirtier:
        print join ', ', keys %{(grep \@{$_}{@XYZ}, {})[0]};
        ;-)

        Caution: Contents may have been coded under pressure.

        Dirtier maybe, quicker not:

        use strict;
        use warnings;
        use Benchmark qw(cmpthese);

        my @XYZ = qw(Mary Mary Mary Mary Joe Joe Joe);

        cmpthese (
            -1,
            {
                'GF' => sub {my %unique; @unique{@XYZ} = (); keys %unique;},
                'DW' => sub {keys %{{map {$_=>undef} @XYZ}};},
                'RJ' => sub {keys %{(grep \@{$_}{@XYZ}, {})[0]};},
            }
        );

        Benchmark results:

                Rate   DW   RJ   GF
        DW   71543/s   -- -62% -73%
        RJ  188025/s 163%   -- -28%
        GF  260808/s 265%  39%   --

        Updated to include Roy Johnson's "dirtier" version


        Perl is Huffman encoded by design.
Re: removing duplicate entries from an array
by neniro (Priest) on Nov 07, 2005 at 21:49 UTC
    Yet another way using junctions:
    #!/usr/bin/perl
    use strict;
    use warnings;

    use Perl6::Junction qw/one/;
    use Data::Dumper;

    my @origin = (1, 2, 3, 2, 3, 4, 1, );
    my @unique;

    for (@origin) {
        push @unique, $_ unless $_ == one(@unique)
    }

    print Dumper \@unique;
      timtowtdi using Set::Scalar:
      #!/usr/bin/perl
      use strict;
      use warnings;

      use Set::Scalar;
      use Data::Dumper;

      my @origin = (1, 2, 3, 2, 3, 4, 1, );
      my $s = new Set::Scalar;
      $s->insert(@origin);
      my @unique = $s->elements;

      print Dumper \@unique;
Re: removing duplicate entries from an array
by Withigo (Friar) on Nov 08, 2005 at 05:21 UTC
    Here's a way that doesn't involve dirtiness:
    use List::MoreUtils qw/uniq/;
    my @foo = (1,1,1,2,3);
    my @uniques = uniq @foo;
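A note for later readers: List::Util also exports a uniq function as of version 1.45, which postdates this thread. Assuming a List::Util at least that new is installed, the same example can be sketched without List::MoreUtils:

```perl
use strict;
use warnings;

# Requires List::Util 1.45 or later; the version check fails
# at compile time on older installations.
use List::Util 1.45 qw(uniq);

my @foo = (1, 1, 1, 2, 3);

# uniq keeps the first occurrence of each value, in input order.
my @uniques = uniq @foo;

print "@uniques\n";    # prints "1 2 3"
```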