Ordering hash replacements to avoid clobbering things (update chaining)

Suppose you have a hash of replacements to perform.

my %replace = ( qw/
    COM  SEC
    MOB  HIS
    MOC  COM
    ICM1 INO
    ICM2 INO
    EU   HIS
    CY   GRE
    AE   MOB
    IN   ICC
    GR   GRE
    MH   MOC
    CO   MOC
    MO   HIS
/ );
[download]

That is, if you see 'COM', you must replace it with 'SEC'. For a real world example, my company restructures itself internally every once in a while, and while the people remain the same, the departments change name, merge and split. The above example represents the fields in database records to be updated to reflect the change.

Looking carefully, one sees that the transfer COM -> SEC has to be applied before MOC -> COM. If performed in the wrong order, MOC will be transformed to COM and thence to SEC, which would be a Bad Thing.

We have to look for the case where a hash value x (the replacement) matches a hash key y (what is to be replaced). If so, the tranformation of hash key y needs to be performed first, to clear the way for the hash key (whatever it is) to transform itself to x in a subsequent pass.

Thus, we want an array of references to tranform hashes. If we get the order right, we are guaranteed of not clobbering anything in our transforms.

Another problem to worry about is in the case of COM -> SEC and SEC -> COM. In this case we have a loop. The only way to solve this problem is to weaken the loop by introducing an intermediate step: COM -> ZZZ, SEC -> COM, ZZZ -> SEC (where ZZZ is a random string guaranteed not to exist among the set of transforms). I didn't run into this problem (I should be thankful for small mercies) but I added the code to at least detect the problem. Solving it is left as an exercise to the reader.

I'm interested in feedback. Is there a better way? Simpler? Non-recursive? A better way of dealing with the @order array? (I don't like passing it as a parameter, I think it would be more elegant to say my @order = demangle( \%replace). And demangle is a silly name, but I can't think of a good action/verb that describes what I'm doing. Hell, even a better title for this snipper would help (something that will help people search for it -- thanks extremely).

#! /usr/bin/perl -w

use strict;

my %replace = ( qw/
    COM  SEC
    MOB  HIS
    MOC  COM
    ICM1 INO
    ICM2 INO
    EU   HIS
    CY   GRE
    AE   MOB
    IN   ICC
    GR   GRE
    MH   MOC
    CO   MOC
    MO   HIS
/);

my @order;
demangle( \%replace, \@order );

sub demangle {
    my $r     = shift;
    my $order = shift;

    my %invert;
    @invert{ values %$r } = keys %$r;

    my( %okay, %collide );

    for my $key( sort keys %$r ) {
        if( exists $invert{$key} ) {
            $collide{$key} = $r->{$key};
        }
        else {
            $okay{$key} = $r->{$key};
        }
    }
    unshift @$order, \%okay;

    if( %collide ) {
        my @loop_keys = sort keys %collide;
        my @loop_vals = sort values %collide;
        my $is_loop = 1;
        for( my $n = 0; $n < scalar @loop_keys; ++$n ) {
            if( $loop_keys[$n] ne $loop_vals[$n] ) {
                $is_loop = 0;
                last;
            }
        }
        if( $is_loop ) {
            warn "\t$_\t$collide{$_}\n" for sort keys %collide;
            die "loop in transforms detected, bailing out\n";
        }
        demangle( \%collide, $order );
    }
}

my $pass = 0;
for my $r( @order ) {
    ++$pass;
    print "Pass $pass\n";
    for my $key( keys %$r ) {
        print "\t$key -> $r->{$key}\n";
    }
}
__END__
# produces:
Pass 1
        COM -> SEC
Pass 2
        MOB -> HIS
        MOC -> COM
Pass 3
        ICM1 -> INO
        ICM2 -> INO
        EU -> HIS
        CY -> GRE
        IN -> ICC
        GR -> GRE
        MH -> MOC
        CO -> MOC
        AE -> MOB
        MO -> HIS
[download]

Comment on Ordering hash replacements to avoid clobbering things (update chaining) Select or Download Code

Replies are listed 'Best First'.
Re: Ordering hash replacements to avoid clobbering things by Corion (Patriarch) on Feb 26, 2003 at 14:27 UTC
There is also Regexp::Subst::Parallel, which does the same thing you do (I guess). `perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web` [download]	[reply] [d/l]
Re: Re: Ordering hash replacements to avoid clobbering things by xmath (Hermit) on Feb 26, 2003 at 14:33 UTC
Yea, that'll work too (although I think using a module for this is a bit overkill; especially since they're simple regex-free substitutions)	[reply]
Re: Ordering hash replacements to avoid clobbering things by Hofmator (Curate) on Feb 26, 2003 at 17:04 UTC
Here goes, interesting little problem ... my solution is non-recursive, can handle loops and allows to write `my @order = demangle(\%replace);`. Update After sleeping over it I've greatly simplified the algoritm, now I think it's elegant. ~~You have to decide yourself if this algorithm is nicer than the rather elegant recursive solution.~~ sub demangle { my $r = shift; my %illegal; @illegal{%$r} = (); my @chains; LOOP: while (my($k,$v) = each %$r) { for my $c (@chains) { if ($c->[-1] eq $k) { # append to end of chain push @$c, $v; next LOOP; }; if ($c->[0] eq $v) { # prepend to start of chain unshift @$c, $k; next LOOP; } } push @chains, [$k, $v]; # create new chain } # fix circular replacements for my $c (@chains) { if ($c->[0] eq $c->[-1]) { # we have a circle my $new_key; do { $new_key = join '', map { ('a'..'z')[rand 26] } 1..8; } while exists $illegal{$new_key}; $illegal{$new_key}++; unshift @$c, $new_key; push @$c, $new_key; } } my @order; while (@chains) { push @order, { map { $_->[-2] => pop @$_ } @chains }; @chains = grep @$_ > 1, @chains; } return @order; } [download] -- Hofmator	[reply] [d/l] [select]
Re: Ordering hash replacements to avoid clobbering things by xmath (Hermit) on Feb 26, 2003 at 14:19 UTC
•Update: I've retracted this post, it doesn't apply to the situation (I missed the part about it being db records that need to be updated) This stuff below is only useful if you need to do the substitutions in a string. Read more... (2 kB)	[reply] [d/l] [select]
Re:x2 Ordering hash replacements to avoid clobbering things by grinder (Bishop) on Feb 26, 2003 at 16:17 UTC
I'm not interested in regexps as I'm not working with strings. This is being used to update database fields, so unless you're doing some really fancy tie-ing, regexps aren't gonna fly :) For reference, the heart of the code that uses this snippet looks like this: my $db = DBI->connect( $DSN, 'user', 'sekret', {AutoCommit => 0}) or die "Couldn't connect to database $DSN: ${\DBI->errstr}\n"; END { $db and $db->disconnect } my $ss = $db->prepare( q{update t set department = ? where department += ?}); die unless $ss; my $ok = 1; REPLACE: for my $r( @order ) { for my $key( keys %$r ) { print "$key -> $r->{$key}\n"; if( !$ss->execute( $r->{$key}, $key )) { warn "cannot update $key to $r->{$key}\n${\$ss->errstr}\n" +; $ok = 0; last REPLACE; } } } $ok ? $db->commit : $db->rollback; [download] NB: The above code is condensed from production code. I have excised things that have no relevance to the example, so it just may or may not compile :) print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'	[reply] [d/l]
Re: Re:x2 Ordering hash replacements to avoid clobbering things by xmath (Hermit) on Feb 26, 2003 at 16:33 UTC
•Update: ok, I missed the line "The above example represents the fields in database records to be updated to reflect the change." -- my apologies	[reply]
Re: Ordering hash replacements to avoid clobbering things by extremely (Priest) on Feb 26, 2003 at 20:19 UTC
Since you asked... I call this update chaining. And it reminds me of the old story about demonstrating the "new" features of a word processor (I heard it about the old WordStar, severely dating myself...) The person shows her boss how you can change "all the 'a's to 'e's in a document." They then say "...and you can change them right back the same way. Uh. Oh." Of course, at this point, they discover that the Undo feature only backsteps one event. :) -- $you = new YOU; honk() if $you->love(perl)	[reply]
Re: Ordering hash replacements to avoid clobbering things (update chaining) by BrowserUk (Patriarch) on Mar 01, 2003 at 05:18 UTC
Somewhat belated, but I had fun exploring different methods. Whether this is better in any way I doubt, but somebody might find it interesting. #! perl -slw use strict; use Data::Dumper; sub unwindDependancies{ my ($hashref) = @_; my %copy = %{$hashref}; my @keys = keys %$hashref; my @deps; for (@keys) { next unless exists $copy{$_}; push @deps, [ $_ ]; while( exists $hashref->{$_}) { unshift( @{ $deps[-1] }, $_ = $hashref->{$_} ); @{$deps[-1]} > @keys and die('Circular reference found inv +olving:', "@{$deps[-1]}"); delete $copy{$_}; } } my %h; for (@deps) { my $r = \%h; $r = exists $r->{$_} ? $r->{$_} : ($r->{$_} = {}) for @$_; } #print Dumper \%h; my (@order, @stack); my $r=\%h; my $t=0; { while(my ($key, $val) = each %$r) { push @order, $key; push(@stack, $r), $r = $val, next if keys %$val; } $r = pop(@stack), redo if @stack; } # print "@order"; return @order; } my %test = ( qw[A B B C D E E F F G G C J H H F I H K H N M + O P P Q Q N S U T V]); print join' ', unwindDependancies \%test; my %replace = ( qw/ COM SEC MOB HIS MOC COM ICM1 INO ICM2 INO EU HIS CY GRE AE MOB IN ICC GR GRE MH MOC CO MOC MO HIS / ); print join' ', unwindDependancies \%replace; __DATA__ C:\test>238721 V T C G F H I J K E D B A U S M N Q P O ICC IN SEC COM MOC CO MH HIS EU MO MOB AE GRE CY GR INO ICM1 ICM2 [download] Examine what is said, not who speaks. 1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong. 2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible 3) Any sufficiently advanced technology is indistinguishable from magic. Arthur C. Clarke.	[reply] [d/l]


Syntactic Confectionery Delight
	PerlMonks