Re^3: Most elegant way to dispose of duplicates using map

Re^4: Most elegant way to dispose of duplicates using map by johngg (Canon) on Oct 30, 2006 at 22:28 UTC
You just need to make a key for the `%seen` hash that is your id and version joined together in some way. Here I join them with a colon:

    use strict;
    use warnings;
    use Data::Dumper;

    my @partTuples = (
        q{abc,1.1,apple},   # 1st element
        q{def,3.6,orange},  # no dups. so OK
        q{abc,1.5,pear},    # OK id only dup.
        q{abc,1.1,kiwi},    # dup. id and version
        q{ghi,1.2,peach},   # no dups. so OK
        q{xyz,1.1,plum},    # OK version only dup.
    );

    my %seen = ();
    my @uniquePTs =
        grep { ! $seen{join q{:}, $_->{id}, $_->{version}} ++ }
        map  { { id => $_->[0], version => $_->[1], classification => $_->[2] } }
        map  { [split m{,}] }
        @partTuples;

    print Dumper(\@uniquePTs);

The output is

    $VAR1 = [
              {
                'version' => '1.1',
                'classification' => 'apple',
                'id' => 'abc'
              },
              {
                'version' => '3.6',
                'classification' => 'orange',
                'id' => 'def'
              },
              {
                'version' => '1.5',
                'classification' => 'pear',
                'id' => 'abc'
              },
              {
                'version' => '1.2',
                'classification' => 'peach',
                'id' => 'ghi'
              },
              {
                'version' => '1.1',
                'classification' => 'plum',
                'id' => 'xyz'
              }
            ];

Cheers, JohnGG
Re^5: Most elegant way to dispose of duplicates using map by exussum0 (Vicar) on Oct 31, 2006 at 14:43 UTC
I'd suggest choosing the delimiter very carefully if you want to serialize a key in this manner. If, for instance, you used a comma and had one pair of values "1,2" and "3" and another pair "1" and "2,3", both would serialize to the same key.
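To make the collision concrete, here is a minimal sketch. The sample pairs and the helper name `make_key` are made up for illustration and are not from the thread. It shows two different pairs producing the same comma-joined key, and one possible guard: refusing any field that already contains the delimiter.

```perl
use strict;
use warnings;

# Hypothetical helper: join fields with a delimiter, but die if any
# field already contains that delimiter, since the key would be ambiguous.
sub make_key {
    my ($delim, @fields) = @_;
    for my $field (@fields) {
        die "field '$field' contains delimiter '$delim'\n"
            if index($field, $delim) >= 0;
    }
    return join $delim, @fields;
}

# Two different pairs that collide when naively joined with a comma.
my $naive_a = join q{,}, q{1,2}, q{3};
my $naive_b = join q{,}, q{1}, q{2,3};
print "naive keys collide\n" if $naive_a eq $naive_b;

# The guarded version refuses the ambiguous input instead.
my $guarded = eval { make_key(q{,}, q{1,2}, q{3}) };
print "guarded: $@" unless defined $guarded;

# A delimiter that is absent from the data is fine.
print make_key(q{:}, q{abc}, q{1.1}), "\n";
```

Whether dying is the right reaction depends on the data; escaping the delimiter inside each field would be another option.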
Re^6: Most elegant way to dispose of duplicates using map by johngg (Canon) on Oct 31, 2006 at 14:53 UTC
Agreed. I chose a colon in this example as there were none in any of the strings that were going to form the keys. Similarly, a simple concatenation is also potentially dangerous, e.g. "frederick" and "son" vs. "fred" and "erickson". However, I should have stressed the point, so thank you for doing it for me. Cheers, JohnGG
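One way to sidestep the delimiter question altogether is to key `%seen` on two levels rather than on a joined string. This is a rough sketch of that alternative, not something proposed in the thread. The records are invented for illustration and include the "frederick"/"son" and "fred"/"erickson" pairs that would collide under plain concatenation.

```perl
use strict;
use warnings;

# Sample records invented for illustration.
my @records = (
    { id => 'frederick', version => 'son' },
    { id => 'fred',      version => 'erickson' },
    { id => 'frederick', version => 'son' },      # genuine duplicate
);

# Two-level hash: the id and version are never merged into one string,
# so no delimiter is needed and the two distinct pairs stay distinct.
my %seen;
my @unique = grep { ! $seen{ $_->{id} }{ $_->{version} }++ } @records;

printf "%d unique of %d\n", scalar @unique, scalar @records;
```

The trade-off is one nested hash per distinct id, which is usually negligible for lists of this size.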
Re^5: Most elegant way to dispose of duplicates using map by rashley (Scribe) on Oct 31, 2006 at 15:00 UTC
I really need to crack the magic map/grep code. I see what you're doing, and I think a colon will work for the data I'm dealing with, but if I understand this, I'll need to change the way I'm putting my original @partTuples together. This:

    @partTuples = map {
        my @t = split(',');
        {id=>$t[0], version=>$t[1], classification=>$t[2]}
    } @partTuples;

isn't working, since we're doing the mapping later on, but I'm not sure what your code is expecting. Thanks for all the help.
Re^6: Most elegant way to dispose of duplicates using map by johngg (Canon) on Oct 31, 2006 at 15:43 UTC
Although not familiar with `$cgi->param()`, from your OP it looked like it returned a list of strings that you assigned to an array, each string being three comma-delimited fields. I just made up some gash data that had the same structure. The code I gave goes from the array of strings through to the array of unique part tuple hashes without stopping along the way. You could even take it further by feeding the return of `$cgi->param()` straight into the `map`s, like this.

    my @uniquePTs =
        grep { ! $seen{join q{:}, $_->{id}, $_->{version}} ++ }
        map  { { id => $_->[0], version => $_->[1], classification => $_->[2] } }
        map  { [split m{,}] }
        $cgi->param('partID');

Reading this code from the bottom up:

1) You call `$cgi->param()`, which returns a list of strings that are passed, one at a time, into the bottom `map`.

2) Things are passed into and out of `map` and `grep` in `$_`, so the bottom `map` takes the string passed in and splits it on commas. The resultant list is placed inside an anonymous array constructor `[ ... ]`, so a reference to the new anonymous array is passed out to the `map` above, again in `$_`.

3) In the second `map` the value passed in in `$_` is a reference to an array, so to use it we need to dereference it like `$_->[0]` etc. In this `map` we construct an anonymous hash using `{ ... }` and populate the key/value pairs. The reference to the hash is in turn passed out to the `grep`.

4) In the `grep` we again need to dereference `$_`, this time to access the hash like `$_->{id}`. By combining the values for the "id" and "version" keys we can construct a key for the `%seen` hash that we use to detect duplicates. We `grep` out only those anonymous hashes whose "id" and "version" haven't already occurred in the `%seen` hash.

5) Finally, those hash references that have passed the `grep` are assigned to the `@uniquePTs` array, as the `grep{...} map{...} map{...} list` returns a list.
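The same steps can also be unrolled with a named intermediate array for each stage, which may make the chain easier to follow. This is only a sketch: `@strings` is a made-up stand-in for whatever `$cgi->param('partID')` returns, and `@fieldRefs` and `@partHashes` are names invented here. The leading `+` is added to make it explicit to perl that the braces build an anonymous hash.

```perl
use strict;
use warnings;

# Stand-in for the list that $cgi->param('partID') is assumed to return.
my @strings = ( q{abc,1.1,apple}, q{abc,1.5,pear}, q{abc,1.1,kiwi} );

# Steps 1-2: split each string on commas, keeping an array reference.
my @fieldRefs = map { [ split m{,} ] } @strings;

# Step 3: turn each array reference into a hash reference.
# The leading + marks the braces as an anonymous hash, not a block.
my @partHashes = map {
    +{ id => $_->[0], version => $_->[1], classification => $_->[2] }
} @fieldRefs;

# Step 4: keep only the first hash seen for each id/version pair.
my %seen;
my @uniquePTs =
    grep { ! $seen{ join q{:}, $_->{id}, $_->{version} }++ } @partHashes;

# Step 5: @uniquePTs now holds the surviving hash references.
print "$_->{id} $_->{version} $_->{classification}\n" for @uniquePTs;
```

With this sample input the "kiwi" entry is dropped, because its id and version repeat those of the first element.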
I hope I've explained this adequately but I'm rushing a bit as I have to leave for an appointment soon. If I've totally misunderstood what `$cgi->param('partID');` does, let me know and I'll adjust the code. Cheers, JohnGG
Re^6: Most elegant way to dispose of duplicates using map by rashley (Scribe) on Oct 31, 2006 at 15:38 UTC
Oops, never mind. You already took that into account. Thanks!


	PerlMonks