http://qs321.pair.com?node_id=228543

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a recursive routine that does destructive processing on a passed ref to an AoHoA... but as I return from each level of recursion, I need to 'undo' the changes made.

Basically, I need to provide a deepcopy of (or some subtree of) the structure as I recurse in so that I retain an unchanged copy as I come back up the tree.

I did a search on CPAN for DeepCopy and came up empty apart from Data:Dumper which of course does a somewhat different thing.

The best I've come up with is:

sub deepcopy{ return $_[0] unless ref $_[0]; return [ map{ deepcopy($_) } @{$_[0]} ] if ref $_[0] eq 'ARRAY' +; return { map{ deepcopy($_) } %{$_[0]} } if ref $_[0] eq 'HASH'; return $_[0]; }

but I far from convinced that this hasn't been written before; or that this isn't full of wholes.

Note: I know it doesn't handle coderefs but that doesn't figure in this particular application.

Is there something better?


Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Replies are listed 'Best First'.
Re: Deepcopy of complex structures.
by dash2 (Hermit) on Jan 21, 2003 at 00:46 UTC
    use Storable qw/clone/;

    Storable is in the standard distribution now, and clone does what you want. I don't think it handles CODEREFS though.

    dave hj~

(jeffa) Re: Deepcopy of complex structures.
by jeffa (Bishop) on Jan 21, 2003 at 01:52 UTC

    How about Clone?

    UPDATE: some code
    use strict; use warnings; use Clone qw(clone); use Data::Dumper; my $str; $str .= "{ $_ => " for 'a'..'z'; $str .= '}' x 26; my $orig = { nested_hash => eval $str, code_ref => sub { my $foo = 'bar'; sub{$foo} }, }; my $clone = clone($orig); print Dumper $clone->{nested_hash}; print $clone->{code_ref}->()->(), $/;

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    

      Unfortunately it uses XS. Despite nearly a week of trying, I have yet to succeed in building Perl myself, and PPM/PPM3 both stopped working (with no indication as to why) some time ago.:(

      If I could work out were AS keep the modules (as opposed to the PPD's) I might have a go at manual install... I've been successful with other modules from Dada's site amongst others, but googling didn't turn up the right stuff. I will have to hack my way through the PPM source to work out where PPM looks for the files:(


      Examine what is said, not who speaks.

      The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

        http://ppm.activestate.com/PPMPackages/zips/6xx-builds-only/Clone.zip seems to have the stuff to install on Windows... but no makefile.

        I think all PPM does is copy these files to the right place, so maybe you can download the zip file and do the same thing my (working) PPM 2 did:

        C:\>ppm PPM interactive shell (2.1.5) - type 'help' for available commands. PPM> install Clone Install package 'Clone?' (y/N): y Installing package 'Clone'... Bytes transferred: 5252 Installing D:\Perl\site\lib\auto\Clone\Clone.bs Installing D:\Perl\site\lib\auto\Clone\Clone.dll Installing D:\Perl\site\lib\auto\Clone\Clone.exp Installing D:\Perl\site\lib\auto\Clone\Clone.lib Installing D:\Perl\site\lib\Clone.pm Installing D:\Perl\site\lib\auto\Clone\autosplit.ix Writing D:\Perl\site\lib\auto\Clone\.packlist
        Note: I've never actually tried this.

        Finding out where the libs/files are is just a case of doing:
        perl -V
        isn't it? Or did I misunderstand something? (Have no space for AS perl on my Windows machines..)

        C.

•Re: Deepcopy of complex structures.
by merlyn (Sage) on Jan 21, 2003 at 04:59 UTC
Re: Deepcopy of complex structures.
by ihb (Deacon) on Jan 21, 2003 at 01:01 UTC

    What's wrong with Data::Dumper? Data::Dumper even has switches for the explicit reason of eval()ing and deep-copying.

    The biggest flaw in your routine, as far as I can see, is that it doesn't handle circular references. It'll recurse to death upon

    my $aref = []; $aref->[0] = $aref; deepcopy($aref);

    Hope I've helped,
    ihb

      My main problem with Data::Dumper--beyond the fact that it regularly blows my swap space on relatively small data structures--is that whilst the stringification/ evaling process makes sense if you need to transmit the structure between system, or store it to disk, all I want is an in-memory copy. Preferably as fast as possible, given this is a recursive routine that may be called to process large volumes of structure.


      Examine what is said, not who speaks.

      The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Deepcopy of complex structures.
by grinder (Bishop) on Jan 21, 2003 at 08:30 UTC
    I'm writing a recursive routine that does destructive processing on a passed ref to an AoHoA... but as I return from each level of recursion, I need to 'undo' the changes made.

    Basically, I need to provide a deepcopy of (or some subtree of) the structure as I recurse in so that I retain an unchanged copy as I come back up the tree.

    It sounds to me as if you are all barking up the wrong tree. From the definition of your problem, it sounds like you just want to use local, to localise a copy of the data structure you are working on. When you leave the recursed scope, the data structure will resume its previously stored values, without any effort on your part.

    If you need to make permanent modifications from time to time then you will have to arrange a signalling mechanism between levels, so that the parent receives the replacement of the child subtree, and does the replacement.


    update for aragorn: you are free to initialise a localised variable to the value of the variable it is localising, which gets rid of that problem.

    update for BrowserUk: Yes, indeed locality doesn't propagate, but can't you just localise as you go, or does that make the code too messy? Can you show an example of the data structure?

    #! /usr/bin/perl -w <code> use strict; use vars qw/ $r $r2 /; $r = { foo => [ { one => 1, two => 2, three => 3 }, { four => 4, five => 5, six => 6 }, ], }; { print "orig : $r->{foo}[1]{five}\n"; $r2 = $r->{foo}; local $r2->[1]{five} = $r2->[1]{five} + 500; print "then : $r->{foo}[1]{five}\n\n"; } print "after: $r->{foo}[1]{five}\n";

    Hope this helps, i.e. what I am trying to get at is that it is always nice to try and get the language to do the dirty work for you, rather than writing code in the language to achieve the same functionality.


    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

      Indeed, for simple scalars, localising a copy (or just my'ing a local copy) will take care undoing changes as the program recurses.

      However, if the scalar in question is a ref to an array or hash, localising the ref will undo any changes to the ref itself, but it won't undo any changes to the structure referenced.


      Examine what is said, not who speaks.

      The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

      As far as I understand the problem, BrowserUk needs to read from the "old" datastructure, so local isn't really going to help, because it gives you a nice empty variable which effectively hides the old variable and does not do a complete copy of the datastructure.

      Arjen

      Your right. Letting the language do it is the right way to do it. What I am doing (using Clone) is to localise subtree's of the data structure into my'd vars at any given level of recursion and then allowing perl to undo this as it unwinds. See Re: Re: Deepcopy of complex structures. for somewhat more verbose (though not necessarilly clearer explanation :)

      With regard an example of the structure. If I ever get the algorithm to work and if it works as well as I hope, I'm sure to post the code here.

      If it doesn't I'll probably keep real quiet about it:)


      Examine what is said, not who speaks.

      The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Deepcopy of complex structures. (resolved)
by BrowserUk (Patriarch) on Jan 21, 2003 at 12:53 UTC

    My thanks to iguanodon for the reference and install trace for Clone and Jeffa for the quickstart on its use. Special thanks to Merlyn for the reference to his article which told me what to look for in testing, and to everyone else for their help.

    As you can see from the Benchmark below, not all deepcopy methods are equal in terms of performance. (Data::Dumper isn't tested because I couldn't work out how to use it for this. See comments in code below).

    C:\test>228543 Benchmark: running CloneCopy, DeepCopy, StorableCopy , each for at least 1 CPU seconds ... CloneCopy: 1 wallclock secs ( 1.10 usr + 0.01 sys = 1.11 CPU) @ +27625.00/s (n=30719) DeepCopy: 1 wallclock secs ( 1.10 usr + 0.00 sys = 1.10 CPU) @ +20539.93/s (n=22635) StorableCopy: 1 wallclock secs ( 1.03 usr + 0.00 sys = 1.03 CPU) @ +541.67/s (n=559) Rate StorableCopy DeepCopy CloneCopy StorableCopy 542/s -- -97% -98% DeepCopy 20540/s 3692% -- -26% CloneCopy 27625/s 5000% 34% -- Difference found!

    As you can see, Clone wins hands down, with my own (modified from the original) attempt coming a deceptively creditable second.

    What I mean by deceptively, is that despite it's relatively tardy performance, Storable really takes second place honours because, as is indicated by the "Differences found!" output at the bottom, not all the routines do the same thing. As was pointed out above by ihb, my DeepCopy routine does handle recursive structures. The difference is clearly shown in the side-by-side Dumper output below. A second flaw is that it doesn't handle bless'd references at all. I haven't tested the others for this as it isn't a requirement for my purposes.

    Data::Dumper comparison of cloned structures

    BTW: If anyone has a (reference to a) better version of my multicheck() routine, preferably one that will tell me which of the things passed differed if only one does, I'd love to see it.


    Examine what is said, not who speaks.

    The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Deepcopy of complex structures.
by Anonymous Monk on Jan 21, 2003 at 03:56 UTC
    The snippet I'm pasting in below may help. The trick to upgrading it to do deep copies of circular structures is to store the parts already worked out in a hash and look them up before deep copying, then use any previously created part of the deep copy instead of making a fresh one - a sort of poor man's memoising. But I haven't had to do that yet, so I only put in a check count to prevent recursing too deep. Oh, and obviously it doesn't yet handle all possible types, just the ones I needed in my application. PML.
    sub deep_copy { # this subroutine returns a deep copy of the specified reference # it cannot cope with circular structures yet my $ref_prm = shift; my $remaining_depth = (shift) - 1; return undef if (not defined($ref_prm)); # allowed to be undefined my $ref_type = ref($ref_prm); if ($remaining_depth < 0) { warn "Excessive call depth, possible circular structure - " . "deep copy failing\n"; return undef; } return $ref_prm if (not $ref_type); # something, and will not be un +defined if ($ref_type eq "REF") { my $deeper_copy_ref = deep_copy($$ref_prm, $remaining_depth); # +recursive call return \$deeper_copy_ref; } if ($ref_type eq "SCALAR") { my $deeper_copy_scalar = $$ref_prm; return \$deeper_copy_scalar; } if ($ref_type eq "ARRAY") { my @deeper_copy_array = (); foreach my $copy_value (@$ref_prm) { # recursive calls push(@deeper_copy_array, deep_copy($copy_value, $remaining_de +pth)); } return \@deeper_copy_array; } if ($ref_type eq "HASH") { my %deeper_copy_hash = (); foreach my $copy_key (keys %$ref_prm) { # recursive calls $deeper_copy_hash{$copy_key} = deep_copy($ref_prm->{$copy_key +}, $remaining_depth); } return \%deeper_copy_hash; } die "There is something in $ref_prm that cannot be deep copied conv +eniently\n"; }
Re: Deepcopy of complex structures.
by bart (Canon) on Jan 21, 2003 at 11:39 UTC

    I'm writing a recursive routine that does destructive processing on a passed ref to an AoHoA... but as I return from each level of recursion, I need to 'undo' the changes made.

    ...

    Is there something better?

    Yeah... avoid doing the processing in a destructive manner, if you can. That surely sounds like the wrong approach. If you absolutely need it, you can keep track of what nodes you have already visited by storing a stringified reference in a hash, for example.

      Oh, if only life were so easy. :^)

      Basically, the algorithm recursively processes this data structure, and deeper levels of the recursion rely on earlier levels having removed branches of the structure. This pruning effectively stops previously processed and rejected branches of the data being reprocessed at any given level.

      However, as the recursion unwinds (having so far failed to resolve the solution), I need to restore the pruned branches in order that they may be considered when walking other branches taken higher (shallower) in the recursion process. The structure contains many cross-references at the leaf nodes. It is these that get pruned as the process progresses at the deeper levels, but that must be restored as the recursion unwinds.

      The routine only selectively clones subtrees at any given level and passes these in when it recurses, so it is, in effect, tracking the changes and unwinding them. but given that the nature and depth of the structure is variable between runs, coming up with a generalised alternative tracking mechanism is more complex (and costly) than cloning the subtrees to lexicals as I go.


      Examine what is said, not who speaks.

      The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Deepcopy of complex structures.
by fsn (Friar) on Jan 21, 2003 at 15:53 UTC
    Uhm... May I propose another angle on this: don't copy the data, since copying dta is generally a Bad Thing(tm). If the data size grows, so will time. I have a gut feeling it will grow some O(n^2), but I'm not sure.

    I suggest you try to just mark branches as 'off limits' for lower recursionlevels, either in the same data structure or perhaps in a similar datastructure on the side. There is a connection here to how you would implement a four-in-a-row-game (or reversi or..), marking something before taking a recursion and then removing that mark when you return. This way you don't have to alter the data in your structure, unless the rest of the algorithm needs to. Also, it might actually be easier to implement. But I would have to know more about what you are doing to say anything certain.

Re: Deepcopy of complex structures.
by Aristotle (Chancellor) on Jan 22, 2003 at 14:05 UTC
    I will chime in and say I don't think deep copying is the best approach here. Use a global hash to store off-limits areas instead; you can localize individual keys of a hash as you descend, so Perl will do the backtracking for you naturally. This solution will most likely be much more efficient in terms of both time and space.

    Makeshifts last the longest.