PerlMonks  

Arrow Operator Question

by sectokia (Pilgrim)
on Mar 26, 2023 at 08:59 UTC

sectokia has asked for the wisdom of the Perl Monks concerning the following question:

I thought I knew perl, and then this tripped me up:

    use Data::Dumper qw(Dumper);
    my $a = { b => {} };
    my $f = $a->{b}{c}{d}//undef;
    print Dumper $a
result:
$VAR1 = { 'b' => { 'c' => {} } };

Why does the 'c' key get created during what is meant to be only an assignment operation? And by that logic, why isn't 'd' made a key of 'c' with value undef?

Replies are listed 'Best First'.
Re: Arrow Operator Question
by swl (Parson) on Mar 26, 2023 at 09:17 UTC
Re: Arrow Operator Question
by kcott (Archbishop) on Mar 26, 2023 at 11:45 UTC

    G'day sectokia,

    [Note: $a (and $b) are special variables; it's best to only use these for their intended purposes to avoid unexpected side effects; I've replaced $a with $h in the code below. Also, '//undef' is completely superfluous: if the LHS is undef, use undef instead. :-)]

    See Autovivification and Arrow Notation in perlref.

    Perl will autovivify as needed. It requires 'c' to check for 'd', so that key is autovivified. The 'd' key either exists and has a value or it doesn't exist and the value is undef: no autovivification is needed here.

    You can use exists() to check for 'c'; only attempting to get a value for 'd' if 'c' exists. That can become unwieldy when there are multiple levels of keys; if you think autovivification is a problem, simply allow it then delete() afterwards.

    The following code has examples which demonstrate these points.

    $ perl -E '
        use Data::Dump;
        {
            say "*** Autovivification";
            my $h = { b => {} };
            dd $h;
            my $f = $h->{b}{c};
            say $f // "undefined";
            dd $h;
            my $g = $h->{b}{c}{d};
            say $g // "undefined";
            dd $h;
        }
        {
            say "*** No autovivification";
            my $h = { b => {} };
            dd $h;
            my $f = $h->{b}{c};
            say $f // "undefined";
            dd $h;
            my $g = exists $h->{b}{c} ? $h->{b}{c}{d} : undef;
            say $g // "undefined";
            dd $h;
        }
        {
            say "*** Autovivify then delete";
            my $h = { b => {} };
            dd $h;
            my $f = $h->{b}{c}{x}{y}{z};
            say $f // "undefined";
            dd $h;
            delete $h->{b}{c};
            dd $h;
        }
    '

    Output:

    *** Autovivification
    { b => {} }
    undefined
    { b => {} }
    undefined
    { b => { c => {} } }
    *** No autovivification
    { b => {} }
    undefined
    { b => {} }
    undefined
    { b => {} }
    *** Autovivify then delete
    { b => {} }
    undefined
    { b => { c => { x => { y => {} } } } }
    { b => {} }
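The point that chained exists() checks become unwieldy at depth can be handled with a small helper. The following is a minimal sketch, not code from the thread: the sub name `dig` is hypothetical, and it assumes every intermediate level is a plain hash reference. It walks one key at a time and stops at the first missing level, so nothing is created along the way.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): follow @keys through nested
# hashes and return undef as soon as a level is missing or is not a hash.
# Only exists() on an already-existing hash is used, so nothing vivifies.
sub dig {
    my ($node, @keys) = @_;
    for my $key (@keys) {
        return unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

my $h = { b => {} };
my $value = dig($h, qw(b c x y z));

print defined $value ? $value : 'undefined', "\n";
print exists $h->{b}{c} ? "c was created\n" : "c was not created\n";
```

The trade-off is an extra sub call per lookup in exchange for leaving the data structure untouched, which avoids both the long exists() chain and the delete-afterwards step.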

    — Ken

      I prefer to implement kcott's idea this way. The short-circuit behaviour of the "and" operator (see C-style Logical And in perlop) stops evaluation at the first test that fails, so the deeper lookups that would cause autovivification never run.

          use strict;
          use warnings;
          use Data::Dumper;
          use Test::More tests=>2;
          my $h = { b => {} };
          my $f = $h->{b}{c}{d} if exists($h->{b}{c}) and exists($h->{b}{c}{d});
          is_deeply( $h, { b => {} }, 'no autovivification' );
          is( $f, undef, 'no value assigned' );

      OUTPUT:

          1..2
          ok 1 - no autovivification
          ok 2 - no value assigned

      Bill
        my $f = $h->{b}{c}{d} if exists($h->{b}{c}) and exists($h->{b}{c}{d});

        This is dangerous: the nature of my $var = ... if ... is predictable but very nonintuitive (it acts something like a state variable). It has long been regarded as a bug that perl core would dearly love to fix, but cannot due to back-compat.

        I recommend always either splitting out the declaration from the assignment:

            my $f;
            $f = $h->{b}{c}{d} if exists($h->{b}{c}) and exists($h->{b}{c}{d});

        .. or using state explicitly when that's what you actually intend:

            use feature 'state'; # 'use 5.010' or later also gives this automatically
            state $f = $h->{b}{c}{d} if exists($h->{b}{c}) and exists($h->{b}{c}{d});
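A third way to avoid the conditional "my" is to fold the condition into the expression itself with the ternary operator. This is a sketch offered as an alternative, not something proposed in the thread; it reuses the same $h and $f names for continuity.

```perl
use strict;
use warnings;

my $h = { b => {} };

# The declaration is unconditional: $f is always assigned, either the
# nested value or undef. The && short-circuits, so the second exists()
# (which would vivify 'c') only runs when 'c' is already there.
my $f = (exists $h->{b}{c} && exists $h->{b}{c}{d})
      ? $h->{b}{c}{d}
      : undef;

print defined $f ? "f = $f\n" : "f is undefined\n";
print exists $h->{b}{c} ? "c was created\n" : "c was not created\n";
```

Because nothing is attached to the statement as a modifier, $f behaves like any other freshly declared lexical each time the statement runs.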

        G'day Bill,

        That's certainly a valid way to go. Consider the following:

        $ perl -e '
            use strict;
            use warnings;
            use Test::More tests => 8;

            my $h = { b => {} };
            my ($f, $f2, $f3, $f4);

            $f = $h->{b}{c}{d} if exists($h->{b}{c}) and exists($h->{b}{c}{d});
            is_deeply( $h, { b => {} }, "no autovivification" );
            is( $f, undef, "no value assigned" );

            $f2 = $h->{b}{c}{d} if exists $h->{b}{c};
            is_deeply $h, { b => {} }, "f2: no autovivification";
            is $f2, undef, "f2: no value assigned";

            $f3 = $h->{b}{c}{d}{x}{y}{z}
                if exists $h->{b}{c}
                and exists $h->{b}{c}{d}
                and exists $h->{b}{c}{d}{x}
                and exists $h->{b}{c}{d}{x}{y};
            is_deeply $h, { b => {} }, "f3: no autovivification";
            is $f3, undef, "f3: no value assigned";

            $f4 = $h->{b}{c}{d}{x}{y}{z};
            delete $h->{b}{c};
            is_deeply $h, { b => {} }, "f4: autovivification removed";
            is $f4, undef, "f4: no value assigned";
        '

        Update: The code above is a modification of the original. Something was niggling me about what I first wrote, but I couldn't see the problem. ++hv's response to your post alerted me to the issue. My first, less-than-good effort is in the spoiler below. Note that the output is unchanged.

        Output:

            1..8
            ok 1 - no autovivification
            ok 2 - no value assigned
            ok 3 - f2: no autovivification
            ok 4 - f2: no value assigned
            ok 5 - f3: no autovivification
            ok 6 - f3: no value assigned
            ok 7 - f4: autovivification removed
            ok 8 - f4: no value assigned

        • Short-circuiting is not actually needed in your example (see f2)
        • Short-circuiting works but can be unwieldy with more complex data structures (see f3)
        • Allowing autovivification then removing it results in cleaner, and easier to maintain, code (see f4)
          • This option has potential drawbacks. Consider the case where $h->{b}{c}{q} existed and had a meaningful and required value but is removed by 'delete $h->{b}{c}'.
          • On second thoughts: this is probably a very bad idea (even though it does work in this specific test script).
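The drawback in the last two bullets can be made concrete. This sketch is an illustration only: the key q and its value are made up to stand in for "a meaningful and required value", and the $had_c guard is one possible mitigation, not something proposed in the thread.

```perl
use strict;
use warnings;

# Case 1: 'c' already holds real data before the deep read.
my $h  = { b => { c => { q => 'required value' } } };
my $f4 = $h->{b}{c}{d}{x}{y}{z};   # vivifies d, x and y under the existing c
delete $h->{b}{c};                 # removes the vivified levels AND q
print exists $h->{b}{c} ? "c kept\n" : "c and q are gone\n";

# Case 2: remember whether 'c' existed and delete only if it did not.
my $g     = { b => { c => { q => 'required value' } } };
my $had_c = exists $g->{b}{c};
my $v     = $g->{b}{c}{d}{x}{y}{z};
delete $g->{b}{c} unless $had_c;
print "q survived\n"          if exists $g->{b}{c}{q};
print "d is still vivified\n" if exists $g->{b}{c}{d};
```

The guard protects the existing data, but it leaves the vivified d, x and y levels behind whenever c was already present, so it is only a partial clean-up.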

        I think each has its merits and probably comes down to best choice on a case by case basis.

        — Ken

      Thanks. The reason I had undef is that the style I generally use is more like this:

      if ($foo == ($bar->{c}{d}{e}//'moo')) { ... }
        "if ($foo == ($bar->{c}{d}{e}//'moo')) { ... }"

        You may have typed that in a hurry. While I do follow the gist of what you're saying, I do hope you're aware that '==' is used for numbers (e.g. 300) and 'eq' is used for strings (e.g. "moo"). See "perlop: Equality Operators" for more complete details.
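To illustrate the difference, here is a small sketch using the $foo and $bar names from the quoted line. The starting values ($foo set to 0, $bar an empty hash) are assumptions chosen to make the pitfall visible.

```perl
use strict;
use warnings;

my $bar = {};
my $foo = 0;

my ($numeric_match, $string_match);
{
    # 'moo' is not numeric, so == converts it to 0. Under "use warnings"
    # this normally warns; the warning is silenced here only to keep the
    # output clean.
    no warnings 'numeric';
    $numeric_match = $foo == ($bar->{c}{d}{e} // 'moo');
}
# eq compares the values as strings: "0" versus "moo".
$string_match = $foo eq ($bar->{c}{d}{e} // 'moo');

print "== matched\n" if $numeric_match;
print "eq matched\n" if $string_match;
print "c and d were autovivified\n" if exists $bar->{c}{d};
```

With these values the numeric comparison succeeds even though 0 and 'moo' are clearly different, while the string comparison does not. The lookup also vivifies c and d in $bar, which is the behaviour the original question was about.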

        — Ken

Re: Arrow Operator Question
by tobyink (Canon) on Mar 26, 2023 at 11:54 UTC

    As others have said, it's autovivification.

    As a quick way to explain why the 'c' key gets created, but not 'd' though: $a->{b}{c}{d} is looking for a 'd' key within $a->{b}{c}, so you are asserting that $a->{b}{c} must be a hashref for your program to succeed at all. So Perl makes it a hashref for you. But you're not asserting anything in particular about the nature of $a->{b}{c}{d}, so Perl doesn't make it anything.
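A short sketch of that reasoning, using $h in place of $a as suggested earlier in the thread. The second hash, $s, is a made-up counter-example in which 'c' already holds a plain string, so the "must be a hashref" requirement cannot be met.

```perl
use strict;
use warnings;

# 'c' is missing: Perl creates it as an empty hash so 'd' can be looked up.
my $h = { b => {} };
my $d = $h->{b}{c}{d};
print "c is now a ", ref($h->{b}{c}), " reference\n";
print "d exists\n" if exists $h->{b}{c}{d};   # 'd' itself was not created

# 'c' holds a plain string: it cannot be treated as a hash, and under
# "use strict" the lookup dies instead of vivifying anything.
my $s  = { b => { c => 'plain string' } };
my $ok = eval { my $x = $s->{b}{c}{d}; 1 };
my $error = $ok ? '' : $@;
print "lookup failed: $error" unless $ok;
```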

Re: Arrow Operator Question (Autovivification motivated)
by LanX (Saint) on Mar 26, 2023 at 13:12 UTC
    Others already pointed you to the "how", namely Perl's autovivification.

    But it seems nobody explained "why".

    It's DWIM.

    If you try the same thing in JS you get an error.

    in the browser's console (F12)

        > a = { b : {} };
        > f = a["b"]["c"]["d"]["e"]
        Uncaught TypeError: Cannot read properties of undefined (reading 'd')
            at <anonymous>:1:16
        > f = a.b.c.d.e
        Uncaught TypeError: Cannot read properties of undefined (reading 'd')
            at <anonymous>:1:11

    Assigning to a deeply nested path is even more a PITA in those languages

        > a["b"]["c"]["d"]["e"] ="X"
        Uncaught TypeError: Cannot read properties of undefined (reading 'd')
            at <anonymous>:1:12
        > a.b.c.d.e = "X"
        Uncaught TypeError: Cannot read properties of undefined (reading 'd')
            at <anonymous>:1:7

    You'll need a loop to build each level step by step.

    But Perl does DWIM by creating accessed levels on the fly.

        > perl -de0
        ...
          DB<1> $a->{b}{c}{d}="X"

          DB<2> x $a
        0  HASH(0x32445c8)
           'b' => HASH(0x3244220)
              'c' => HASH(0x3244130)
                 'd' => 'X'
          DB<3>

    Many paint this as a bug, because it's different to other languages.

    I disagree, it's a case where you can't make an omelet without breaking an egg somewhere.

    And I profit from this far more often than I stumble over it.

    It should be better explained and motivated though.

    meta

    There is the autovivification pragma on CPAN (written "no autovivification;") to disable this on demand.
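A minimal sketch of the autovivification pragma from CPAN in use. The module is not part of core Perl, so this example assumes it may or may not be installed and wraps the pragma in a string eval purely so the script still runs without it; in real code you would normally write "no autovivification;" directly in the scope you want to protect.

```perl
use strict;
use warnings;

my $h = { b => {} };

# String eval is used only so this sketch degrades gracefully when the
# CPAN module is missing; the pragma is lexically scoped to the eval.
my $loaded = eval q{
    no autovivification;
    my $f = $h->{b}{c}{d};   # read-only lookup, should not create 'c'
    1;
};

if ($loaded) {
    print exists $h->{b}{c} ? "c was created\n" : "c was not created\n";
}
else {
    print "autovivification pragma not installed\n";
}
```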

    I personally think a syntactic solution with a no-autovivification operator might have been better.

    Alas there are not many characters left in the alphabet, and I'm not sure such a syntax could work:

    $a->{b}:{c}{d}  # should not vivify beneath b

    FWIW there is also Data::Diver (and other similar modules) on CPAN, which could also be core.

    update

    BTW: Contrary to the title, you don't need the arrow operator to see this effect.
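For instance, starting from a plain hash rather than a hash reference shows the same behaviour with no explicit arrow anywhere. This sketch uses a hypothetical %h that mirrors the structure from the original question.

```perl
use strict;
use warnings;

my %h = (b => {});
my $f = $h{b}{c}{d};   # the arrow between adjacent subscripts is optional

print exists $h{b}{c}    ? "c was created\n" : "c was not created\n";
print exists $h{b}{c}{d} ? "d was created\n" : "d was not created\n";
```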

    update

    I'd love to see other or even better motivations for "Autovivification" discussed.

    Cheers Rolf
    (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
