PerlMonks  

O, the horrors of references and complex data structures

by DeusVult (Scribe)
on Feb 06, 2001 at 01:28 UTC ( [id://56508]=perlquestion )

DeusVult has asked for the wisdom of the Perl Monks concerning the following question:

I have an array of hashes. Well, technically it is an array of hash references, which AFAIK is the only way to construct an array of hashes in Perl. This array is declared globally. Let us call this array AoH.

I also have a globally declared hash, let us call it globHash.

My script jumps around from subroutine to subroutine. Each subroutine makes a few changes to globHash and then calls one or more (usually more) other subroutines which do the same. That is the reason that globHash is declared globally, because it is constantly being changed, re-changed, and changed again.

Eventually, I come to the end of one chain of subroutine calls. At the end of this chain, I push globHash onto AoH like this

push @AoH, \%globHash;
Then, through the magic of the run-time stack, I back up to the beginning of this chain, and follow a different one until that chain, too, comes to an end, and I push the hash onto the array again. Lather, rinse, repeat as necessary (and it is necessary to do so many times).

When all is said and done, I find that I have only succeeded in pushing dozens of identical hashes onto the array. All of the array entries, no matter when they were pushed, have the same values as the last hash I pushed onto the array.
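A minimal sketch of the symptom described above, assuming a simple loop stands in for the chains of subroutine calls (the key name `run` is made up for illustration):

```perl
use strict;
use warnings;

our (%globHash, @AoH);    # package globals, as in the question

for my $n (1 .. 3) {
    $globHash{run} = $n;      # stand-in for "a few changes" to the global hash
    push @AoH, \%globHash;    # pushes a reference to the one hash, not a copy
}

# Every element is the very same reference ...
my $same = ($AoH[0] == $AoH[1] && $AoH[1] == $AoH[2]) ? 1 : 0;

# ... so every element shows whatever the hash holds after the last change.
my @seen = map { $_->{run} } @AoH;

print "same reference: $same; values: @seen\n";
```

Comparing references with `==` compares their addresses, which is why `$same` ends up true here.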

Obviously, references are more like pointers than I had imagined. Now that this has happened, I am not surprised that it has done so (how impressive of me, predicting events that have already occurred). The question remains, now that I know this happens, how do I stop it? If I was writing this in C++, I would do

hash *globHash;
array<hash> AoH;
// do stuff ...
// much later
AoH.push( globHash );
delete globHash;
globHash = new *hash; // now I have a fresh new hash!
So my question to you, O monks of wondrous knowledge, is this: is there some sort of delete/new equivalent in Perl (and I know that OO constructors are sometimes named new, but that isn't even remotely like what I'm talking about)? Can I get the hash to "reset" itself so that, when I take its reference a second time, it will be a different reference from when I took it the first time?

Some people drink from the fountain of knowledge, others just gargle.

Replies are listed 'Best First'.
Re: O, the horrors of references and complex data structures
by Fastolfe (Vicar) on Feb 06, 2001 at 01:32 UTC
    You probably want to use the { curly-brace hashref constructor } to do this job:
    push(@AoH, { %globHash });
    This essentially copies the %globHash's values and stores a reference to that in @AoH. Similarly, arrays can be copied by using [ @array ].
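As a rough illustration of the copy described above (the loop and the key name `run` are assumptions, not part of the original question), each push below stores a reference to a new anonymous hash:

```perl
use strict;
use warnings;

our (%globHash, @AoH);

for my $n (1 .. 3) {
    $globHash{run} = $n;
    push @AoH, { %globHash };    # new anonymous hash holding a copy of the pairs
}

my @seen     = map { $_->{run} } @AoH;
my $distinct = ($AoH[0] != $AoH[1]) ? 1 : 0;    # different hashes, different addresses

# The same idea for arrays: [ @array ] copies the current elements.
my @array = (1, 2);
my $copy  = [ @array ];
push @array, 3;                  # does not affect the copy

print "values: @seen; distinct: $distinct; array copy: @$copy\n";
```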

    NOTE: If your hash is several levels deep, this only copies the top-level! The references stored within a hash are still copied as-is, meaning some of your changes may still be replicated across each instance. To get a full "deep copy" effect, use the 'dclone' function in Storable, or the Clone module.
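A small sketch of the difference, assuming a hash with one nested array (the keys `name` and `tags` are hypothetical). `dclone` is imported from the core Storable module:

```perl
use strict;
use warnings;
use Storable qw(dclone);

my %globHash = (name => 'first', tags => ['a']);

my $shallow = { %globHash };         # copies only the top-level key/value pairs
my $deep    = dclone(\%globHash);    # recursively copies nested structures too

# Change the original after both copies were taken.
$globHash{name} = 'second';
push @{ $globHash{tags} }, 'b';

# Top-level values are independent in both copies, but the shallow copy
# still shares the nested array with the original.
print "shallow: $shallow->{name} [@{ $shallow->{tags} }]\n";
print "deep:    $deep->{name} [@{ $deep->{tags} }]\n";
```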

Re: O, the horrors of references and complex data structures
by runrig (Abbot) on Feb 06, 2001 at 01:45 UTC
    As an alternative to copying the hash every time, I'd prefer to ditch the global variable and pass a hash reference to all the subroutines (depending on the size of the hash, it could be worthwhile efficiency-wise), something like:
    # ...
    while ($not_done) {
        my %hash;
        sub1(\%hash);
        sub2(\%hash);
        push @AoH, \%hash;
    }

    sub sub1 {
        my $href = shift;
        $href->{key} = "value";
    }

    sub sub2 {
        my $href = shift;
        # Mess with href some more
    }
Re (tilly) 1: O, the horrors of references and complex data structures
by tilly (Archbishop) on Feb 06, 2001 at 01:44 UTC
    Others have already told you the clean way to do this (using anonymous hashes), but when you enter the first function you can first do
    local *globHash;
    and leave all else the same. This replaces the global typeglob with a new temporary replacement, your sub runs, pushes the ref onto the array, and then returns, restoring the original global. You then run again and get a new typeglob, etc.

    Not the best way, but it works.
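A minimal sketch of the `local *globHash` approach, assuming two hypothetical subs (`chain_start` and `leaf`) stand in for a chain of calls. Note that the localized hash starts out empty each time:

```perl
use strict;
use warnings;

our (%globHash, @AoH);

# End of a chain: record the current state of the global hash.
sub leaf {
    $globHash{leaf} = 1;
    push @AoH, \%globHash;
}

# Start of a chain: localize the glob so this chain gets its own %globHash.
sub chain_start {
    my ($label) = @_;
    local *globHash;              # fresh, empty %globHash until this sub returns
    $globHash{label} = $label;
    leaf();                       # callees see the localized hash (dynamic scope)
}

chain_start('first');
chain_start('second');

my @labels     = map { $_->{label} } @AoH;
my $outer_keys = keys %globHash;  # the original global was never touched

print "labels: @labels; keys left in the global: $outer_keys\n";
```

Because `local` covers the whole glob, it would also localize a `$globHash` or `@globHash` of the same name, if the program had one.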

Re: O, the horrors of references and complex data structures
by chipmunk (Parson) on Feb 06, 2001 at 01:38 UTC
    The easiest way to solve this problem is to create an anonymous hash and push it onto the array: push @AoH, { %globHash }; The other way to solve this problem is to push a reference to a lexical that goes out of scope. For example:
    while (<>) { my %hash = split; push @AoH, \%hash; }
    Each iteration of the loop will create a new instance of %hash, while the old instances will be accessible through the references in @AoH.
Re: O, the horrors of references and complex data structures
by goldclaw (Scribe) on Feb 06, 2001 at 03:04 UTC
    OK, I won't actually recommend this, as it's highly magical, and it is not very intuitive what you are doing. (I'd hate to maintain code like this...)

    I'll also call your globHash globalHash, if you don't mind. You'll understand why...

    Anyway, the globalHash is stored in something called a glob. Now a glob is sort of a hash, with the values being references to everything that can be called globalHash. So %globalHash is actually stored as a reference to a hash in the *globalHash glob. (There is also a reference to an array, in case you define @globalHash. Since you probably haven't, that reference is probably undef in your program.) Now, to set %globalHash to a new hash, simply set the glob to an empty hash reference like this:

    push @AoH, \%globalHash;
    *globalHash = {};
    That sets the hash part of the glob to a new, empty hash reference, so that %globalHash is now empty. The good thing is that a glob works a lot like a hash table, so the old hash you had stored there is not gone. There are still references to it in your AoH array. It's just not possible to get at it through globalHash anymore. Magic, huh...

    Anyway, by assigning a scalar reference (to a number, string, etc.) to the *globalHash glob, you will change the scalar part of the glob. Assigning an array reference will change the array part of it. Note also that this won't work for variables you have declared with my, as those don't use globs.
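A short sketch of the glob assignment described above, assuming the hash is a package variable declared with `our` and that the key name `run` is made up for illustration:

```perl
use strict;
use warnings;

our (%globalHash, @AoH);

%globalHash = (run => 1);
push @AoH, \%globalHash;
*globalHash = {};            # point the hash slot of the glob at a new, empty hash

$globalHash{run} = 2;        # this fills the new hash, not the one already pushed
push @AoH, \%globalHash;
*globalHash = {};

my @seen = map { $_->{run} } @AoH;
my $left = keys %globalHash; # number of keys in the current (fresh) hash

print "values: @seen; keys in current hash: $left\n";
```

Only the hash slot changes here. A `$globalHash` or `@globalHash` living in the same glob would be left alone.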

    goldclaw

      Note also that this won't work for variables you have declared with my as those don't use globs.

      Forgive me if I'm being stupid, but when I fail to declare a variable with my, I get a compile error and my script won't run. So this technique must be even more magical than you let on :)

      Some people drink from the fountain of knowledge, others just gargle.

        Well, my isn't the only way to declare a variable and stop use strict 'vars' from complaining. Other ways include use vars, local and using fully qualified variable names (e.g. $main::foo).

        In general, the difference between package variables (that live in a typeglob) and lexical variables (which don't) is a very deep magic that lies at the heart of a thorough knowledge of Perl. I recommend a close study of the Variables section of the Camel book (3ed).
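To make the package/lexical distinction a bit more concrete, here is a rough sketch with made-up variable names. It also uses `our`, which is not mentioned above but is another way to declare a package variable under strict. Package variables show up in the symbol table (`%main::`), lexicals do not:

```perl
use strict;
use warnings;
use vars qw(%declared_with_vars);    # the "use vars" idiom

our %declared_with_our;              # "our" (an addition to the list above)

%declared_with_vars    = (a => 1);
%declared_with_our     = (b => 2);
%main::fully_qualified = (c => 3);   # fully qualified name, no declaration needed
my %lexical            = (d => 4);   # lexical: lives in a pad, not in a glob

my $count = keys %main::fully_qualified;   # second use, avoids a "used only once" warning

# Look each name up in the main package's symbol table.
my @in_symbol_table = grep { exists $main::{$_} }
    qw(declared_with_vars declared_with_our fully_qualified lexical);

print "in symbol table: @in_symbol_table\n";
```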

        --
        <http://www.dave.org.uk>

        "Perl makes the fun jobs fun
        and the boring jobs bearable" - me

Re: O, the horrors of references and complex data structures
by arturo (Vicar) on Feb 06, 2001 at 01:37 UTC

    Let me also confirm your suspicion: the problem is indeed that what you're doing is storing a bunch of references to the same hash, so when you change that hash, you change the values you see when you dereference all the references to it. Fastolfe's giving you a way of storing 'snapshots' of the global hash on each run through.

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

Re: O, the horrors of references and complex data structures
by CiceroLove (Monk) on Feb 06, 2001 at 07:04 UTC
    I know I am new at all of this, but couldn't you just undef(%globHash); before you start your next iteration?

    CiceroLove

      Nope, that undefs the hash, so the references he keeps in AoH point to a now-undef'ed hash. He somehow needs to dissociate the %globalHash name from the actual hash. There are basically two ways of doing that if he wants to keep %globalHash truly global: using local, or messing around with globs.

      sunw287 ~ > perl -e '%globalHash=(one=>"two"); $ref=\%globalHash; undef(%globalHash); print join("\n",keys %{$ref})'
      sunw287 ~ >
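The same point as a small script, a sketch under the assumption that the hash is declared with `our`. `undef` empties the hash in place, so the reference already stored in the array still points at the same, now empty, hash:

```perl
use strict;
use warnings;

our (%globalHash, @AoH);

%globalHash = (one => 'two');
push @AoH, \%globalHash;

undef %globalHash;                   # clears the existing hash; no new hash is created

my $kept = keys %{ $AoH[0] };        # keys still visible through the stored reference
my $same = (\%globalHash == $AoH[0]) ? 1 : 0;   # still the very same hash

print "keys kept: $kept; same hash: $same\n";
```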

      goldclaw
