http://qs321.pair.com?node_id=1187094

ankitpati has asked for the wisdom of the Perl Monks concerning the following question:

The perldoc on keys clearly states

Called in list context, returns a list consisting of all the keys of the named hash…
but makes no attempt to explain why.

Googling around for an explanation also yielded no results.

Put very simply, why does

sub uniq { my %hash = map { $_ => 1 } @_; return keys %hash; }
work, but
sub uniq { return keys map { $_ => 1 } @_; }
not?


Edit:

I express my sincerest thanks to the monks for a warm welcome into the monastery.

Following the suggestions, I tried

sub uniq {
    # two pairs of braces around map
    return keys %{ { map { $_ => 1 } @_ } };
}
which works perfectly fine.

However, prior to this, even before I posted the question, I experimented with

sub uniq {
    # single pair of braces around map
    return keys %{ map { $_ => 1 } @_ };
}
and it did not work.

I had always believed that braces can convert a list with an even number of elements into a valid HASHref, and that the percent sign (%) can convert any valid HASHref into a hash.

The question now is this: if map just produces a list, why does it need two pairs of braces to emit a valid HASHref?
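
For readers following along, here is a minimal sketch of the idea behind the two pairs of braces, split into separate steps. The variable names ($href, $count, @items) are made up for illustration, and the last step shows the general principle (map never hands back a reference) rather than how perl parses the exact single-brace line above.

```perl
use strict;
use warnings;

my @items = qw(foo bar foo baz);

# Inner braces: anonymous hash constructor. It builds a hash from the
# list that map produces and yields a reference to that hash.
my $href = { map { $_ => 1 } @items };
print ref($href), "\n";          # HASH

# Outer %{ ... }: dereference. This gives keys a real hash to work on.
my @ok = sort keys %{ $href };
print "@ok\n";                   # bar baz foo

# Without the constructor there is no reference at all. In scalar
# context map returns the number of elements it generated.
my $count = map { $_ => 1 } @items;
print "$count\n";                # 8

# Dereferencing that plain number as a hash dies under "strict refs".
my @bad = eval { keys %{ $count } };
print "dereference failed\n" if $@;
```

The sort is only there to make the printed order predictable; keys itself returns the keys in no particular order.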

Re: Why does 'keys' need a named hash?
by duelafn (Parson) on Apr 05, 2017 at 12:04 UTC

    It needs a real hash to work on (possibly there is a more technical description of this), but the map just produces a list of values, not a hash.

    You can do it all in one line though:

    use 5.010;

    sub uniq { return keys %{ {map { $_ => 1 } @_} }; }

    say for uniq qw/ foo bar foo baz bip foo /;

    Good Day,
        Dean

Re: Why does 'keys' need a named hash?
by huck (Prior) on Apr 05, 2017 at 12:03 UTC

    Because the second map just returns a list and keys has no use for just a list. The => operator is sometimes called the fat comma. Its use does not make something a hash.

      The relevant documentation: Comma Operator in perlop.

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Why does 'keys' need a named hash?
by pme (Monsignor) on Apr 05, 2017 at 12:12 UTC
    Hi ankitpati,

    map { $_ => 1 } @_
    returns a list with an even number of elements. 'keys' works only on hashes. In your working example, %hash is initialized with the list returned by 'map', and then 'keys' can do its job on %hash.

    HTH

Re: Why does 'keys' need a named hash?
by BillKSmith (Monsignor) on Apr 05, 2017 at 15:16 UTC
    The documentation uses the word "named" in the sense of "specified in the calling statement". You can take the keys of an anonymous hash:
    use strict;
    use warnings;

    my @Uniq = uniq('a', 'b', 'c', 'a', 'b', 'a');
    print "@Uniq\n";

    sub uniq {
        return keys %{{map {$_=>1} @_}};
    }

    OUTPUT:
    c a b

    Edit: Revised example to be more like the original problem.

    Bill
Re: Why does ‘keys’ need a named hash?
by Anonymous Monk on Apr 05, 2017 at 21:26 UTC

    I read the answers above and feel the issue could be further clarified... The question is about braces and the ways in which they are used in perl.

    To work with hashes, two useful syntaxes should be understood.

    • One is anonymous hash constructor: { @list }
    • The other is hash dereference syntax: %{ $href }
    • You can combine the two to produce a hash from a list, and then, make it list again: %{ { @list } }.
      In the process, duplicate keys are squashed.
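
The round trip described in the list above can be sketched as follows. This is only an illustration; @list, $href and @flat are hypothetical names, and the sample data is made up.

```perl
use strict;
use warnings;

# Three key/value pairs; the key "a" appears twice.
my @list = (a => 1, b => 2, a => 3);

# Anonymous hash constructor: list in, hash reference out.
my $href = { @list };

# Constructor plus dereference in one expression: list in, list out.
# In list context the hash flattens back into key/value pairs.
my @flat = %{ { @list } };

print scalar(@list), " elements in, ", scalar(@flat), " elements out\n";
# 6 elements in, 4 elements out

# For a duplicated key, the value that came last in the list is kept.
print "a => $href->{a}\n";       # a => 3
```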

    With map, the braces have an ambiguous meaning: they could be a hash constructor, or (more commonly), a BLOCK. To avoid confusion, you might want to rewrite the map expression using parentheses this time: map(( $_ => 1 ), @list).
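
A small sketch of the forms mentioned here, on made-up data. The BLOCK form and the parenthesized EXPR form produce the same flat list. The third line uses the leading-plus form shown in perldoc -f map to ask for a hash constructor instead of a BLOCK; it is included only to show the other meaning the braces can have.

```perl
use strict;
use warnings;

my @list = qw(a b a);

# Braces as a BLOCK: each element yields the two-element list ($_, 1).
my @from_block = map { $_ => 1 } @list;

# EXPR form with parentheses: no braces, so nothing to misread.
my @from_expr = map(( $_ => 1 ), @list);

print "@from_block\n";           # a 1 b 1 a 1
print "@from_expr\n";            # a 1 b 1 a 1

# Braces as a hash constructor: the leading + marks an expression, and
# the EXPR form of map needs a comma before the list. The result is one
# hash reference per input element, not a flat list.
my @hrefs = map +{ $_ => 1 }, @list;
print scalar(@hrefs), " refs of type ", ref($hrefs[0]), "\n";
# 3 refs of type HASH
```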

    Finally, if you intend to use a hash (a data structure that cannot have duplicate keys) to implement the uniq routine, you must at some step construct one! Either anonymous or named one; but for anon case, you'll have the extra braces.

Re: Why does 'keys' need a named hash?
by karthiknix (Sexton) on Apr 05, 2017 at 12:48 UTC

    The map function traverses a list, much like a foreach loop, and returns a new list built from it. So keys cannot operate directly on the result of map. Putting the result of map into an anonymous hash and dereferencing it gives keys the hash it needs.

Re: Why does 'keys' need a named hash?
by vrk (Chaplain) on Apr 05, 2017 at 14:02 UTC

    Actually, if you squint, you would indeed expect keys to return all elements with even index in a list. Similarly, values on a list should return all elements with odd index. That's how you make a hash: even elements become keys, odd elements become values. I can think of a couple of use cases for this DWIM behaviour that don't involve hashes...

      Erm, no, I wouldn't expect that. A hash isn't just an even-sized list. It uses a hashing algorithm (hence the name "hash") to divide the data up into a series of arbitrarily-ordered buckets. The resulting data structure is almost completely unlike a list. Why would functions designed to work on such a structure also work on lists?

      The only way I can squint hard enough at it to make keys work on a list would be if keys operated by flattening the hash into a list and then iterating over every other list element. And I can't for the life of me come up with a good reason for making keys work that way.

      If there really are that many use cases for iterating every other element of a list, they would be much better served by adding a new keyword for that functionality than by bloating keys with an unnecessary and anti-performant hash-to-list conversion, particularly since the new keyword could be generalized to iterate over every Nth element of the list instead of only giving "even elements" and "odd elements" options.

      That breaks down if your list has duplicate "keys".

      I just noticed there are functions for this in List::Util: pairkeys and pairvalues. For example:

      use v5.14;
      use List::Util qw(pairkeys pairvalues);

      my @a = qw(a b c d e f);
      say "keys = ", join ", ", pairkeys @a;
      say "values = ", join ", ", pairvalues @a;

      my %h = @a;
      say "keys = ", join ", ", keys %h;
      say "values = ", join ", ", values %h;

      The obvious difference from keys and values of a hash is that pairkeys doesn't remove duplicates and it keeps the original array order.
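
To make that difference concrete, here is a short sketch with a duplicated key. The data and variable names are made up; the version number on the use line assumes List::Util 1.29 or later, the release in which the pair functions first appeared.

```perl
use strict;
use warnings;
use List::Util 1.29 qw(pairkeys);

# The key "a" appears twice.
my @pairs = (a => 1, b => 2, a => 3);

# pairkeys takes every even-indexed element: duplicates and order survive.
my @all_keys = pairkeys @pairs;
print "@all_keys\n";             # a b a

# Going through a hash collapses the duplicate and loses the order,
# so sort is used here only to get predictable output.
my %h = @pairs;
my @hash_keys = sort keys %h;
print "@hash_keys\n";            # a b
```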