PerlMonks  

Referencing localized variables, and typeglobs

by haukex (Archbishop)
on Aug 30, 2017 at 17:56 UTC

haukex has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

So there's this strange thing you can do:

    use Data::Dump;   # provides dd
    our %h;
    my $x = do { local %h=(a=>'3'); \%h };
    my $y = do { local %h=(b=>'5'); \%h };
    dd \%h;  # {}
    dd $x;   # { a => 3 }
    dd $y;   # { b => 5 }

While I wouldn't actually do this, I couldn't seem to find any documentation of it, and I'm wondering if it's "safe" to do, or it is some kind of dark magic that should be avoided. (Also I've only tested it on 5.18 and 5.26 so far.)

The second part of my question is that the Symbol module's gensym claims to return a reference to an "anonymous" glob. But it turns out they're not actually anonymous:

    use Symbol qw/gensym/;
    my $foo = gensym;
    *$foo->{bar} = 'quz';
    dd \%Symbol::GEN0;  # { bar => "quz" }

So I'm wondering why Symbol::gensym isn't just implemented like this, which I got from BrowserUk's post here? It seems to me that, while the globrefs are still associated with a name, they really do seem to be more "anonymous".

sub gensym { \do{ local *ANONGLOB; *ANONGLOB } }

In the thread I linked to, tye mentions that unique names make debugging a bit easier, but I am wondering if there are any other technical downsides to the above?
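
To make the comparison concrete, here is a minimal sketch of how globs produced by the one-liner above behave. The name my_gensym is hypothetical (chosen so as not to clash with Symbol::gensym); the body is the same as the one-liner quoted above.

```perl
use strict;
use warnings;

# Hypothetical name; same body as the one-liner quoted above.
sub my_gensym { \do { local *ANONGLOB; *ANONGLOB } }

my $g1 = my_gensym();
my $g2 = my_gensym();

# Each call yields a reference to a distinct glob ...
print "distinct globs\n" if $g1 != $g2;

# ... whose slots are independent of each other and of the
# package-level *main::ANONGLOB.
${*$g1} = 'first';
${*$g2} = 'second';
our $ANONGLOB;
print "g1 scalar: ", ${*$g1}, "\n";
print "g2 scalar: ", ${*$g2}, "\n";
print "package scalar is ", (defined $ANONGLOB ? "defined" : "undef"), "\n";

# The globs are still associated with a name when stringified.
print "glob name: ", *$g1, "\n";
```

Unlike Symbol::gensym, nothing is left behind in a package stash under a generated name, so there is no %Symbol::GEN0 equivalent to reach the data through.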

Thanks,
-- Hauke D

Update 2017-09-05: So Tie::StdHandle has been doing something like this for about 18 years now: my $fh = \do { local *HANDLE };

Replies are listed 'Best First'.
Re: Referencing localized variables, and typeglobs
by AnomalousMonk (Archbishop) on Aug 30, 2017 at 23:27 UTC

    With reference only to the first part of your OPed question, I agree with Laurent_R's explanations here and here.

    Here's another way to think about the issue that might offer some confidence and comfort, for all its long-windedness. (I don't doubt that you understand all this perfectly well, but perhaps someone else may benefit from this blatheration.) With reference to the OPed code:

    • A simple
          do { local %h; }
      expression saves the original content of the package variable referenced by the  %h symbol and default-initializes a new storage location with the same symbolic name within the scope of the do expression. This new storage location is entirely separate from and independent of the storage location originally associated with the  %h symbol prior to the execution of the local statement in the do-expression. At the end-of-scope of the do-expression, any new storage space allocated within the scope of the do-expression is released (because there is no further reference to it) and the  %h symbol reverts to its old, saved symbolic referent.
    •     do { local %h = (a => '3'); }
      does the same and then explicitly initializes the new storage location. Again, the newly initialized storage is released at the end-of-scope of the do-expression because there is no longer any reference to it.
    •     do { local %h = (a => '3');  \%h; }
      further takes the reference address of the new (and newly initialized) storage location. The do-expression now evaluates to this reference address, but if nothing more is done with the reference address, nothing can ever access the new storage created within the do-expression and this storage is released and (eventually) garbage-collected.
    • The complete statement
          my $x = do { local %h = (a => '3');  \%h; };
      does save the reference address produced by evaluation of the do-expression, so the storage just created within the scope of the do-expression cannot be released. (However, the  %h symbol still "snaps back" at the end-of-scope of the do-expression.) The reference address produced by evaluation of the do-expression survives, and the storage space associated with it will only be released once all copies of this reference to it (e.g., the  $x variable) have passed out of scope.
    Looking at the differing reference addresses of the various storage locations created by your original code example may also be convincing: (Note that this example was run under Perl 5.8.9.)
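
A minimal sketch of that kind of comparison follows. It is an illustration based on the OPed code rather than the listing from this reply, and the addresses printed will differ from run to run.

```perl
use strict;
use warnings;

our %h;
my $x = do { local %h = (a => '3'); \%h };
my $y = do { local %h = (b => '5'); \%h };

# Three separate storage locations: the package hash and the two
# hashes that local created inside the do-expressions.
print '\%h: ', \%h, "\n";
print '$x:  ', $x,  "\n";
print '$y:  ', $y,  "\n";

print "\$x is not the package hash\n" if $x != \%h;
print "\$y is not the package hash\n" if $y != \%h;
print "\$x and \$y are different hashes\n" if $x != $y;

# Writing through $x does not touch the package hash.
$x->{c} = 1;
print "package hash has ", scalar(keys %h), " keys\n";
```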

    ... I'm wondering if it's "safe" ...

    Yes, Dr. Szell, it's perfectly safe.


    Give a man a fish:  <%-{-{-{-<

Re: Referencing localized variables, and typeglobs
by Laurent_R (Canon) on Aug 30, 2017 at 20:13 UTC
    Concerning your first question, the output you get is exactly what I was expecting when reading the code. While I can't really think of any use for such a strange construct, clearly the local keyword makes the hash "local" to the enclosing do block. So each time, the code populates a local copy of %h, takes a reference to that local copy, and assigns the reference to $x or $y. As soon as the code leaves the do block, the hash is reset to its original empty value.

    I may be wrong, but I can't see why it wouldn't be safe to do that. And I don't even think it is magic.

    Or did I perhaps miss something about your concerns?

      Thanks for the reply! My main concern is that I haven't yet found a description of the behavior in the perldocs or (so far) elsewhere.

      So while you are right that the behavior does seem logical, there is another possible interpretation of the code: One might expect that after the effects of the local are over, $x and $y could refer back to the original %h, instead of some anonymous hashes. I'm not saying this interpretation is better or worse than the actual behavior, just that it'd be nice if it were documented.

      Another concern is that perlsub says "This operator works by saving the current values of those variables in its argument list on a hidden stack and restoring them upon exiting the block, subroutine, or eval." - So the values are saved on a stack and presumably the temporary value is popped back off when the scope exits. Without some reassurance from the docs, one might worry that the temporary variables might somehow "go away" when the scope exits (e.g. is this stack refcounted?) and a reference to such a value might become invalid (however unreasonable or not the worry might be).

        One might expect that after the effects of the local are over, $x and $y could refer back to the original %h,
        Why should it? $x and $y are assigned to the value returned by the do block. And this value happens to be an anonymous hash ref produced within the block. And $x does not know anything about the %h hash. The fact that %h is restored to an empty hash immediately thereafter is irrelevant to the value acquired by $x at the time of the assignment.

        Well, I understand your concern, but I do not think there is any reason to worry here. I think the behavior is quite clear.
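
The two interpretations discussed above can be told apart with a small sketch. It assumes that a glob reference (\*h), unlike a hash reference, looks the hash up through the symbol each time it is dereferenced; the variable names $href and $gref are made up for the example.

```perl
use strict;
use warnings;

our %h = (orig => 1);

my ($href, $gref) = do {
    local %h = (a => 3);
    (\%h, \*h);     # hash reference vs. glob reference
};

# The hash reference stays bound to the storage that local created.
print join(',', sort keys %$href), "\n";      # a

# The glob reference goes through the symbol, so after the block
# it sees the restored package hash.
print join(',', sort keys %{*$gref}), "\n";   # orig
```

So the "refers back to the original %h" behavior is what you get from a reference to the glob, while a reference to the hash itself keeps pointing at the localized storage.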

Re: Referencing localized variables, and typeglobs
by kcott (Archbishop) on Aug 31, 2017 at 05:22 UTC

    G'day haukex,

    What you've described here, with respect to local, is pretty much how I've always understood this to work: no dark arts or black magic; effectively, just a simple pushing and popping of a stack.

    In a subsequent reply, you referenced "perlsub: Temporary Values via local()" and then wrote: "Without some reassurance from the docs, one might worry that the temporary variables might somehow "go away" when the scope exits ...". The following code tracks %h through a do block (similar to your code); in addition, do contains an anonymous block which creates new temporary values. It also tracks the lexical $r: declared but uninitialised before do; assigned a value within the anonymous block; and shown to retain that value after do.

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use Data::Dump;

        my $r;
        print "\$r UNDEF\t"; dd $r;

        our %h = (z => 26);
        print "%h BEFORE do\t"; dd \%h;

        my $x = do {
            print "PRE local\t"; dd \%h;
            local %h;
            print "POST local\t"; dd \%h;
            %h = (a => 1);
            print "POST assign\t"; dd \%h;

            print "%h BEFORE anon\t"; dd \%h;
            {
                print "PRE local2\t"; dd \%h;
                local %h;
                print "POST local2\t"; dd \%h;
                %h = (b => 2);
                print "POST assign2\t"; dd \%h;
                $r = \%h;
                print "\$r ASSIGNED\t"; dd $r;
            }
            print "%h AFTER anon\t"; dd \%h;

            \%h;
        };

        print "%h AFTER do\t"; dd \%h;
        print "\$x AFTER do\t"; dd $x;
        print "\$r FINAL\t"; dd $r;

    Output:

        $r UNDEF        undef
        %h BEFORE do    { z => 26 }
        PRE local       { z => 26 }
        POST local      {}
        POST assign     { a => 1 }
        %h BEFORE anon  { a => 1 }
        PRE local2      { a => 1 }
        POST local2     {}
        POST assign2    { b => 2 }
        $r ASSIGNED     { b => 2 }
        %h AFTER anon   { a => 1 }
        %h AFTER do     { z => 26 }
        $x AFTER do     { a => 1 }
        $r FINAL        { b => 2 }

    Hopefully, that's a bit more reassuring. :-)

    — Ken

Re: Referencing localized variables, and typeglobs
by BrowserUk (Patriarch) on Aug 30, 2017 at 20:29 UTC

    FWIW. I went on to use my version of gensym very extensively in some heavily threaded code and it never gave me any problems.

    I don't have access to that code any more, but I might find something if I dig around on my archive CDs.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice. Suck that fhit

      Thanks and don't worry too much about digging it out, knowing that it was used successfully is a good start! :-)

Re: Referencing localized variables, and typeglobs
by haukex (Archbishop) on Aug 31, 2017 at 16:57 UTC

    Thank you Laurent, BrowserUk, AnomalousMonk, and Ken for the explanations and reassurances :-)

    It does certainly make sense that the memory allocated for the localized data structures is subject to Perl's usual memory management. That is basically what I was hoping for but couldn't immediately confirm, so thank you for your explanations. Here's another quick test I did:

        use Devel::Refcount qw/refcount/;
        my $x;
        our %h = (one=>'two');
        {
            local %h = (three=>'four');
            $x = \%h;
            print "$x ", refcount($x), "\n";
        }
        print "$x ", refcount($x), "\n";
        __END__
        HASH(0x2fb246a) 2
        HASH(0x2fb246a) 1
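
The same point can be checked with core modules only. This is a sketch using a weak reference from Scalar::Util instead of Devel::Refcount; the variable names $strong and $weak are made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw/weaken/;

our %h = (one => 'two');
my $strong;
{
    local %h = (three => 'four');
    $strong = \%h;
}

# A weak copy of the reference does not keep the hash alive.
my $weak = $strong;
weaken($weak);

# The localized hash outlives its scope while $strong exists.
print "alive: ", join(',', %$weak), "\n";   # alive: three,four

# Dropping the last strong reference frees it, and the weak
# reference becomes undef.
undef $strong;
print "freed\n" unless defined $weak;
```

The weak reference going undef only after the last strong reference is dropped is consistent with the refcount falling from 2 to 1 at scope exit in the test above.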
