Find unique elements in an array

Replies are listed 'Best First'.
Re: Find unique elements in an array by kvale (Monsignor) on Apr 04, 2004 at 07:52 UTC
Here is a faster solution, using a hash slice: `use Benchmark qw(:all) ; my @a; push @a, int (rand(10)) foreach 1..100; my %unique; my @awd; cmpthese($1000, { 'jc' => sub { foreach my $thingy (@a) { $unique{$thingy} = 1 +; } @awd = keys %unique; }, 'mk' => sub { @unique{ @a} = 1; @awd = keys %unique; }, });` [download] The results are `Benchmark: running jc, mk for at least 3 CPU seconds... jc: 3 wallclock secs ( 3.14 usr + 0.00 sys = 3.14 CPU) @ 53 +5.67/s (n=20522) mk: 4 wallclock secs ( 3.27 usr + 0.00 sys = 3.27 CPU) @ 14 +937.00/s (n=48844) Rate jc mk jc 6536/s -- -56% mk 14937/s 129% --` [download] Update: `$1000` is a typo should be `1000`. I'll keep it as is for comparisons with the following nodes. Thanks BrowserUK and tilly! -Mark	[reply] [d/l] [select]
Re: Re: Find unique elements in an array by tilly (Archbishop) on Apr 04, 2004 at 20:25 UTC
Calling undef on a hash slice is faster still. `use Benchmark qw(:all) ; my @a; push @a, int (rand(10)) foreach 1..100; my %unique; my @awd; cmpthese($1000, { 'jc' => sub { foreach my $thingy (@a) { $unique{$thingy} = 1 +; } @awd = keys %unique; }, 'mk' => sub { @unique{ @a} = 1; @awd = keys %unique; }, 'bt' => sub { undef(@unique{@a}); @awd = keys %unique; }, });` [download] gives me `Rate jc mk bt jc 16556/s -- -54% -62% mk 36080/s 118% -- -18% bt 44007/s 166% 22% --` [download] (Incidentally that is a truely bizarre indent style.)	[reply] [d/l] [select]
Re: Re: Re: Find unique elements in an array by BrowserUk (Patriarch) on Apr 04, 2004 at 22:28 UTC
What is the purpose of using undef as the iteration count for `cmpthese( $1000, ... );`? Is that what causes the warnings? Use of uninitialized value in numeric gt (>) at c:/Perl/lib/Benchmark. +pm line 831. Use of uninitialized value in numeric gt (>) at c:/Perl/lib/Benchmark. +pm line 838. Use of uninitialized value in numeric eq (==) at c:/Perl/lib/Benchmark +.pm line 776. Use of uninitialized value in numeric gt (>) at c:/Perl/lib/Benchmark. +pm line 840. Use of uninitialized value in numeric gt (>) at c:/Perl/lib/Benchmark. +pm line 791. Use of uninitialized value in numeric eq (==) at c:/Perl/lib/Benchmark +.pm line 776. Use of uninitialized value in numeric gt (>) at c:/Perl/lib/Benchmark. +pm line 791. Use of uninitialized value in numeric eq (==) at c:/Perl/lib/Benchmark +.pm line 776. Use of uninitialized value in numeric gt (>) at c:/Perl/lib/Benchmark. +pm line 791. Use of uninitialized value in numeric eq (==) at c:/Perl/lib/Benchmark +.pm line 776. [download] Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail	[reply] [d/l] [select]
Re: Re: Re: Re: Find unique elements in an array by tilly (Archbishop) on Apr 05, 2004 at 04:25 UTC
Re: Re: Re: Re: Re: Find unique elements in an array by BrowserUk (Patriarch) on Apr 05, 2004 at 05:33 UTC
Re: Re: Find unique elements in an array by ysth (Canon) on Apr 04, 2004 at 13:35 UTC
saying `= ()` instead of `= 1` is slightly faster, but the hash slice approach may not scale well with large @a's, since the whole array needs to be placed on the stack at once.	[reply] [d/l] [select]
Re: Re: Re: Find unique elements in an array by kvale (Monsignor) on Apr 04, 2004 at 15:52 UTC
If there is a stack penalty, it is not terrible, as the routine get faster with large arrays: `use Benchmark qw(:all) ; my @a; push @a, int (rand(100)) foreach 1..2_000_000; my %unique; my (@awd1, @awd2, @awd3); cmpthese(5, { 'jc' => sub { foreach my $thingy (@a) { $unique{$thingy} = 1 +; } @awd1 = keys %unique; }, 'mk' => sub { @unique{ @a} = 1; @awd2 = keys %unique; }, 'ys' => sub { @unique{ @a} = (); @awd3 = keys %unique; }, });` [download] yields `Benchmark: timing 5 iterations of jc, mk, ys... jc: 19 wallclock secs (16.75 usr + 0.35 sys = 17.10 CPU) @ 0 +.29/s (n=5) mk: 6 wallclock secs ( 6.00 usr + 0.01 sys = 6.01 CPU) @ 0 +.83/s (n=5) ys: 7 wallclock secs ( 6.00 usr + 0.00 sys = 6.00 CPU) @ 0 +.83/s (n=5) s/iter jc mk ys jc 3.42 -- -65% -65% mk 1.20 185% -- -0% ys 1.20 185% 0% --` [download] The `= ()` optimization does not seem to make much difference. -Mark	[reply] [d/l] [select]
Re: Re: Re: Re: Find unique elements in an array by ysth (Canon) on Apr 04, 2004 at 19:53 UTC
Re: Find unique elements in an array by ccn (Vicar) on Apr 04, 2004 at 16:54 UTC
Re: Re: Find unique elements in an array by kappa (Chaplain) on Apr 04, 2004 at 21:04 UTC
As per ccn's comment below, `@unique{@a} = 1` looks weird and effectively equals to this: `$unique{$a[0]} = 1; undef(@unique{@a[1..$#a]});`, which, BTW, is good, as, according to tilly, storing lots of undefs instead of 1's in a hash to create keys is a Good Thing(tm) for performance.	[reply] [d/l] [select]
Re: Find unique elements in an array by jdporter (Paladin) on Apr 04, 2004 at 17:45 UTC
`--` to you... Not so much for giving an inferior solution (which you did), but because you obviously didn't do the least bit of research to see if this has already been covered. If you had, you probably would have discovered that it is answered in the Perl FAQ: How can I extract just the unique elements of an array?. You might also have found this entry in the Q&A section: How can I display a unique array if it contains some repeated elements. jdporter The 6th Rule of Perl Club is -- There is no Rule #6.	[reply] [d/l]
Re: Find unique elements in an array by dragonchild (Archbishop) on Apr 05, 2004 at 03:14 UTC
My personal unique'ing function is: `sub unique { my %x; @x{@_} = @_; values %x }` [download] The reason to use values is that I need to get the unique set from an array of objects, not scalars. (This assumes, of course, that every object has a unique stringification. As I don't overload stringification in my objects, this works. If you do overload stringification ... good luck. Seriously.) And, the reason for not using references is that I use this function as so: `my @final = map { # Something } unique grep { # Something } map { # Something } @start;` [download] ------ We are the carpenters and bricklayers of the Information Age. Then there are Damian modules.... sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon.* - flyingmoose	[reply] [d/l] [select]
Re: Find unique elements in an array by etcshadow (Priest) on Apr 05, 2004 at 02:30 UTC
Not that it is any different in the manner that it works, but I've always preferred: `my @uniquelist = keys %{{map {$_=>1} @nonuniquelist}};` [download] An anonomous hash makes it much more compact. `------------ :Wq Not an editor command: Wq` [download]	[reply] [d/l] [select]
Re: Re: Find unique elements in an array by BrowserUk (Patriarch) on Apr 05, 2004 at 03:09 UTC
It's compact, but far and away the slowest method. However, you can combine elements of yours with tilly's performance trick and get something that is both fairly compact and almost as quick. Read more... (1230 Bytes) Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail	[reply] [d/l]


Perl: the Markov chain saw
	PerlMonks