Re: Removing duplicates in multi-dimensional arrays

in reply to Removing duplicates in multi-dimensional arrays

I'm not sure that particular snippet can be used directly for your purpose (others may suggest ways to adapt it), but in the interest of education I will try to explain what is going on. For this example, assume @rows initially contains ( 'a', 's', 'd', 'd', 'f' ). (If anyone notices an error in the following, please point it out, so I do not lead someone else astray!)

According to the docs for do:

Not really a function. Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by the while or until loop modifier, executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.)
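
To make the "returns the value of the last command" part concrete, here is a minimal sketch. The variable names ($total, @pair, and so on) are made up for illustration and do not come from the snippet under discussion.

```perl
use strict;
use warnings;

# do BLOCK yields the value of the last statement evaluated in the block.
my $total = do {
    my $x = 2;    # $x and $y are lexical: visible only inside the block
    my $y = 3;
    $x + $y;      # last expression, so this is the value of the do
};
print "$total\n";    # prints 5

# In list context the last statement can hand back a whole list.
my @pair = do { my @tmp = (1, 2); @tmp };
print scalar(@pair), "\n";    # prints 2
```

The point to take away is that anything declared with my inside the block disappears when the block ends; only the final value escapes.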

And the docs for grep:

This is similar in spirit to, but not the same as, grep(1) and its relatives. In particular, it is not limited to using regular expressions.
Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value consisting of those elements for which the expression evaluated to true. In scalar context, returns the number of times the expression was true.
my @foo = grep(!/^#/, @bar); # weed out comments
or equivalently,
my @foo = grep {!/^#/} @bar; # weed out comments
Note that $_ is an alias to the list value, so it can be used to modify the elements of the LIST. While this is useful and supported, it can cause bizarre results if the elements of LIST are not variables. Similarly, grep returns aliases into the original list, much as a for loop's index variable aliases the list elements. That is, modifying an element of a list returned by grep (for example, in a foreach, map or another grep) actually modifies the element in the original list. This is usually something to be avoided when writing clear code.
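
A small sketch of the three behaviours the grep docs describe (list context, scalar context, and aliasing of $_). The sample data in @bar is invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original question.
my @bar = ('# a comment', 'code line', '# another', 'more code');

# List context: grep returns the elements for which the block is true.
my @kept = grep { !/^#/ } @bar;
print join(', ', @kept), "\n";    # prints: code line, more code

# Scalar context: grep returns how many times the block was true.
my $count = grep { !/^#/ } @bar;
print "$count\n";                 # prints 2

# $_ is an alias, so changing it inside the block changes the array itself.
my @copy = @bar;
my $changed = grep { s/^#\s*// } @copy;    # strips the leading '# ' in place
print "$copy[0]\n";               # prints: a comment
```

The last part is the aliasing the docs warn about: the substitution edits @copy directly, which is why it is done on a copy here rather than on @bar.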

So what does this actually mean? Let's walk through it.

do executes its block:
1. The hash %seen is declared as a lexical (my) variable, so it starts out empty and exists only inside the block.
2. grep evaluates for $_ = 'a'. There is no entry for 'a' yet, so the post-increment $seen{'a'}++ returns the old value (undef, treated as 0) and then sets $seen{'a'} to 1. The ! applies to that old value: !0 is 1 (true), so 'a' passes the grep test.
3. grep evaluates for $_ = 's'. There is no entry for 's', so the old value is 0, !0 is 1 (true), and 's' passes the grep test. $seen{'s'} is now 1.
4. grep evaluates for $_ = 'd'. There is no entry for 'd', so the old value is 0, !0 is 1 (true), and 'd' passes the grep test. $seen{'d'} is now 1.
5. grep evaluates for $_ = 'd' again. $seen{'d'} is 1, so the old value is 1, !1 is false, and this instance of 'd' fails the grep test. $seen{'d'} is now 2.
6. grep evaluates for $_ = 'f'. There is no entry for 'f', so the old value is 0, !0 is 1 (true), and 'f' passes the grep test. $seen{'f'} is now 1.
@rows is then assigned the result of the do (that is, the list returned by the grep on @rows): ( 'a', 's', 'd', 'f' ). The first occurrence of each value is kept, and the original order is preserved.
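
The walkthrough can be run as follows. This assumes the snippet has the common form of the idiom (a do block wrapping my %seen and grep { !$seen{$_}++ }); the exact code from the original question may differ.

```perl
use strict;
use warnings;

my @rows = ('a', 's', 'd', 'd', 'f');

# Assumed form of the snippet being discussed.
@rows = do {
    my %seen;                       # fresh, empty hash for this block only
    grep { !$seen{$_}++ } @rows;    # parsed as !( $seen{$_}++ )
};

print join(', ', @rows), "\n";      # prints: a, s, d, f
```

Because ++ binds more tightly than !, the test sees the count from before the increment, which is what lets the first occurrence through and rejects every later one.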

Hope that helps.


In Section Seekers of Perl Wisdom