Re: Referencing in advanced data structures

in reply to Referencing in advanced data structures

Note: This ended up a whole lot longer than I intended. Sorry. But I wanted to walk it step by step to be sure it was clear (or at least firmly muddy). The short version (if you don't want to read all the 'why') is that the 'shorter' form, without the extra ${}'s, actually turns into something slightly different inside perl. But I typed all the why, so please at least pretend to read it :)

It's a question of what happens implicitly. Let's walk the whole process.

For clarity, consider $struct defined in a single block:

my $struct = {
    hashRef => { fruit => 'apple', veggie => 'corn' },
    arrayRef => [ qw(1 2 3 4 5) ],
};

So, if you look at $struct, that's a hash reference (to a hash with two keys, 'hashRef' and 'arrayRef'). %$struct is the dereferenced hash; %{$struct} is the same thing, but you don't need the {}'s in this case. So those two represent a hash (not a reference), with the two keys above.
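If you want to poke at that first step yourself, here's a minimal sketch reusing the $struct from above. The variable names @plain and @braced are just mine for illustration, and the sort is only there so the key order is predictable.

```perl
use strict;
use warnings;

my $struct = {
    hashRef  => { fruit => 'apple', veggie => 'corn' },
    arrayRef => [ qw(1 2 3 4 5) ],
};

# ref() reports what kind of thing a reference points at
print ref($struct), "\n";            # HASH

# %$struct and %{$struct} are the same dereferenced hash,
# so both give back the same two keys
my @plain  = sort keys %$struct;
my @braced = sort keys %{$struct};
print "@plain\n";                    # arrayRef hashRef
print "@braced\n";                   # arrayRef hashRef
```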

$struct->{hashRef} is a hash reference (to a hash with two keys, 'fruit' and 'veggie'). You can't do $struct{hashRef}, because that's looking for the key 'hashRef' in the hash %struct, not the hash reference $struct.
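A rough way to see the difference for yourself: under 'use strict', $struct{hashRef} won't even compile, because no %struct was ever declared. The sketch below wraps it in a string eval purely so the compile error can be caught and reported; $ok is a name I made up for the demo.

```perl
use strict;
use warnings;

my $struct = {
    hashRef => { fruit => 'apple', veggie => 'corn' },
};

# Going through the reference works, and what we get back is
# itself a hash reference
print ref($struct->{hashRef}), "\n";    # HASH

# $struct{hashRef} means "key 'hashRef' of the hash %struct".
# There is no %struct here, so strict refuses to compile it.
# The string eval lets us trap that compile-time failure.
my $ok = eval q{ my $x = $struct{hashRef}; 1 };
print "compiled: ", ($ok ? "yes" : "no"), "\n";    # compiled: no
```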

Now, that's all (hopefully) clear. Here's where it gets trickier. We know that $struct->{hashRef} is a hash reference too (just like $struct is). So to look up a key inside that, we have to dereference it, and we get $struct->{hashRef}->{fruit}.

But, wait a minute. You wrote $struct->{hashRef}{fruit}, not $struct->{hashRef}->{fruit}. What happened to the extra ->? The answer is that perl puts it there implicitly, between {} hash subscripts or [] array subscripts. Consider the array case; $struct->{arrayRef}[2] and $struct->{arrayRef}->[2] both do the same thing, because in the former case perl implicitly puts a -> in, because it knows you're going through a reference rather than a (hash|array).
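Here's a small sketch of the optional arrow between subscripts. The first half uses the $struct from above; the $deep structure is something I invented just to show that the same rule holds when you go more than two levels down.

```perl
use strict;
use warnings;

my $struct = {
    hashRef  => { fruit => 'apple', veggie => 'corn' },
    arrayRef => [ qw(1 2 3 4 5) ],
};

# Arrow left out vs. arrow written out: same element either way
my $implicit = $struct->{arrayRef}[2];
my $explicit = $struct->{arrayRef}->[2];
print "$implicit $explicit\n";              # 3 3

# Hypothetical deeper structure: hash ref -> array ref -> hash ref
my $deep = { list => [ { name => 'first' }, { name => 'second' } ] };
print $deep->{list}[1]{name}, "\n";         # second
print $deep->{list}->[1]->{name}, "\n";     # second
```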

It doesn't do this on the first subscript because it's not necessarily clear what you're trying to subscript. It's notable that earlier versions of perl didn't implicitly add the ->'s; I don't remember when that changed, but I do have some lumps of existing code that have them all explicit because it was needed back then. It was a pretty long time ago.

So, anyway, digression aside: You've written $struct->{hashRef}{fruit}, which is internally translated to $struct->{hashRef}->{fruit}. But you've also written print ${$struct->{hashRef}}{fruit}. Now, this is different.

In the former case, without the extra ${}, you're trying to get the {fruit} subscript of a hash reference, but in a way that lets perl know to add the -> and dereference for you. In the latter case, however, you've already dereferenced it yourself with the ${}. So you have a hash (not a reference) that you're taking the subscript of, and it Just Works, without needing to implicitly add the ->.

Or in a shorter form, the explicit statement $struct->{hashRef} is equivalent to ${$struct}{hashRef}. In the former case, you're dereferencing $struct via the ->, and in the latter via the ${}. Going to the next level, $struct->{hashRef}->{fruit} and ${$struct->{hashRef}}{fruit} are equivalent in the same way. $struct->{hashRef}{fruit} is also equivalent, because perl internally adds the -> and makes it into $struct->{hashRef}->{fruit}.

Aside: You'd actually use %{}, not ${}, to turn a hash ref into a hash, like we did up in the first paragraph after defining $struct. However, all these cases use ${} instead because the end result we're trying for (the value of that {fruit} subscript) is a scalar, so that scalar-ness propagates up to the sigil and we end up with ${}. The sigil represents the end result of the process, not what we're doing in this piece of it. That still trips me up sometimes :)
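To make the sigil business a bit more concrete, here's a sketch that dereferences the same hash ref three ways. The hash slice (the @ line) isn't something from your script; I've added it only because it shows the sigil following the result. The names $one, @both and %copy are made up for the demo.

```perl
use strict;
use warnings;

my $struct = {
    hashRef => { fruit => 'apple', veggie => 'corn' },
};

# $ sigil: the end result is one scalar value
my $one = ${ $struct->{hashRef} }{fruit};

# @ sigil: the end result is a list of values (a hash slice)
my @both = @{ $struct->{hashRef} }{qw(fruit veggie)};

# % sigil: the end result is the whole hash
my %copy = %{ $struct->{hashRef} };

print "$one\n";                     # apple
print "@both\n";                    # apple corn
print scalar(keys %copy), "\n";     # 2
```

Same reference inside the braces every time; only the thing you want back changes, and the sigil changes with it.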

In fact, you could eliminate the first -> entirely too, just by adding extra levels of ${}; you end up with something like ${${$struct}{hashRef}}{fruit}, which is much less readable. That's why we have -> in the first place. This comes into perl via C, where you use -> to access members of a structure via a pointer (which, if you don't know C, is basically the same as a perl reference). You access structure members with ., so using the terms from the perl hash above, you'd get something like struct.hashRef if struct were a structure (hash). But with struct being a pointer to the structure (reference to the hash), you'd have to dereference it first like (*struct).hashRef, which is clumsy, so C lets you write struct->hashRef instead. So that's why perl does it that way.

In case you're still reading and haven't dozed off or died of old age yet, here's a variant of your script with comments quickly suggesting what's happening:

#!/usr/bin/env perl5
use strict;
use warnings;

my $struct = {
    hashRef => { fruit => 'apple', veggie => 'corn' },
    arrayRef => [ qw(1 2 3 4 5) ],
};

# These two are the same; perl adds the implicit '->' to the first and
# turns it into the second internally
print $struct->{hashRef}{fruit}, "\n";
print $struct->{hashRef}->{fruit}, "\n";

# Now we'll deref the $struct->{hashRef} hash reference via ${} instead
# of ->
print ${$struct->{hashRef}}{fruit}, "\n";

# And for the coup de grace, we'll deref $struct via ${} instead of ->
# too.
print ${${$struct}{hashRef}}{fruit}, "\n";

Output:

% ./tst.pl 
apple
apple
apple
apple
