Description: |
Suppose you have a hash of replacements to perform.
my %replace = ( qw/
COM SEC
MOB HIS
MOC COM
ICM1 INO
ICM2 INO
EU HIS
CY GRE
AE MOB
IN ICC
GR GRE
MH MOC
CO MOC
MO HIS
/ );
That is, if you see 'COM', you must replace it with 'SEC'. For a real world example, my company restructures itself internally every once in a while, and while the people remain the same, the departments change name, merge and split. The above example represents the fields in database records to be updated to reflect the change.
Looking carefully, one sees that the transfer COM -> SEC has to be applied before MOC -> COM. If performed in the wrong order, MOC will be transformed to COM and thence to SEC, which would be a Bad Thing.
We have to look for the case where a hash value x (the replacement) matches a hash key y (what is to be replaced). If so, the tranformation of hash key y needs to be performed first, to clear the way for the hash key (whatever it is) to transform itself to x in a subsequent pass.
Thus, we want an array of references to tranform hashes. If we get the order right, we are guaranteed of not clobbering anything in our transforms.
Another problem to worry about is in the case of COM -> SEC and SEC -> COM. In this case we have a loop. The only way to solve this problem is to weaken the loop by introducing an intermediate step: COM -> ZZZ, SEC -> COM, ZZZ -> SEC (where ZZZ is a random string guaranteed not to exist among the set of transforms). I didn't run into this problem (I should be thankful for small mercies) but I added the code to at least detect the problem. Solving it is left as an exercise to the reader.
I'm interested in feedback. Is there a better way? Simpler? Non-recursive? A better way of dealing with the @order array? (I don't like passing it as a parameter, I think it would be more elegant to say my @order = demangle( \%replace). And demangle is a silly name, but I can't think of a good action/verb that describes what I'm doing. Hell, even a better title for this snipper would help (something that will help people search for it -- thanks extremely). |
#! /usr/bin/perl -w
use strict;
my %replace = ( qw/
COM SEC
MOB HIS
MOC COM
ICM1 INO
ICM2 INO
EU HIS
CY GRE
AE MOB
IN ICC
GR GRE
MH MOC
CO MOC
MO HIS
/);
my @order;
demangle( \%replace, \@order );
sub demangle {
my $r = shift;
my $order = shift;
my %invert;
@invert{ values %$r } = keys %$r;
my( %okay, %collide );
for my $key( sort keys %$r ) {
if( exists $invert{$key} ) {
$collide{$key} = $r->{$key};
}
else {
$okay{$key} = $r->{$key};
}
}
unshift @$order, \%okay;
if( %collide ) {
my @loop_keys = sort keys %collide;
my @loop_vals = sort values %collide;
my $is_loop = 1;
for( my $n = 0; $n < scalar @loop_keys; ++$n ) {
if( $loop_keys[$n] ne $loop_vals[$n] ) {
$is_loop = 0;
last;
}
}
if( $is_loop ) {
warn "\t$_\t$collide{$_}\n" for sort keys %collide;
die "loop in transforms detected, bailing out\n";
}
demangle( \%collide, $order );
}
}
my $pass = 0;
for my $r( @order ) {
++$pass;
print "Pass $pass\n";
for my $key( keys %$r ) {
print "\t$key -> $r->{$key}\n";
}
}
__END__
# produces:
Pass 1
COM -> SEC
Pass 2
MOB -> HIS
MOC -> COM
Pass 3
ICM1 -> INO
ICM2 -> INO
EU -> HIS
CY -> GRE
IN -> ICC
GR -> GRE
MH -> MOC
CO -> MOC
AE -> MOB
MO -> HIS
Re: Ordering hash replacements to avoid clobbering things
by Corion (Patriarch) on Feb 26, 2003 at 14:27 UTC
|
There is also Regexp::Subst::Parallel, which does the same thing you do (I guess).
perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
| [reply] [d/l] |
|
| [reply] |
Re: Ordering hash replacements to avoid clobbering things
by Hofmator (Curate) on Feb 26, 2003 at 17:04 UTC
|
Here goes, interesting little problem ... my solution is non-recursive, can handle loops and allows to write my @order = demangle(\%replace);.
Update
After sleeping over it I've greatly simplified the algoritm, now I think it's elegant. You have to decide yourself if this algorithm is nicer than the rather elegant recursive solution.
sub demangle {
my $r = shift;
my %illegal;
@illegal{%$r} = ();
my @chains;
LOOP:
while (my($k,$v) = each %$r) {
for my $c (@chains) {
if ($c->[-1] eq $k) { # append to end of chain
push @$c, $v;
next LOOP;
};
if ($c->[0] eq $v) { # prepend to start of chain
unshift @$c, $k;
next LOOP;
}
}
push @chains, [$k, $v]; # create new chain
}
# fix circular replacements
for my $c (@chains) {
if ($c->[0] eq $c->[-1]) { # we have a circle
my $new_key;
do {
$new_key = join '', map { ('a'..'z')[rand 26] } 1..8;
} while exists $illegal{$new_key};
$illegal{$new_key}++;
unshift @$c, $new_key;
push @$c, $new_key;
}
}
my @order;
while (@chains) {
push @order, { map { $_->[-2] => pop @$_ } @chains };
@chains = grep @$_ > 1, @chains;
}
return @order;
}
-- Hofmator | [reply] [d/l] [select] |
Re: Ordering hash replacements to avoid clobbering things
by xmath (Hermit) on Feb 26, 2003 at 14:19 UTC
|
•Update: I've retracted this post, it doesn't apply to the situation (I missed the part about it being db records that need to be updated)
This stuff below is only useful if you need to do the substitutions in a string.
| [reply] [d/l] [select] |
|
I'm not interested in regexps as I'm not working with strings. This is being used to update database fields, so unless you're doing some really fancy tie-ing, regexps aren't gonna fly :)
For reference, the heart of the code that uses this snippet looks like this:
my $db = DBI->connect( $DSN, 'user', 'sekret', {AutoCommit => 0})
or die "Couldn't connect to database $DSN: ${\DBI->errstr}\n";
END { $db and $db->disconnect }
my $ss = $db->prepare( q{update t set department = ? where department
+= ?});
die unless $ss;
my $ok = 1;
REPLACE: for my $r( @order ) {
for my $key( keys %$r ) {
print "$key -> $r->{$key}\n";
if( !$ss->execute( $r->{$key}, $key )) {
warn "cannot update $key to $r->{$key}\n${\$ss->errstr}\n"
+;
$ok = 0;
last REPLACE;
}
}
}
$ok ? $db->commit : $db->rollback;
NB: The above code is condensed from production code. I have excised things that have no relevance to the example, so it just may or may not compile :)
print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u' | [reply] [d/l] |
|
•Update: ok, I missed the line "The above example represents the fields in database records to be updated to reflect the change." -- my apologies
| [reply] |
Re: Ordering hash replacements to avoid clobbering things
by extremely (Priest) on Feb 26, 2003 at 20:19 UTC
|
Since you asked... I call this update chaining.
And it reminds me of the old story about demonstrating the "new" features of a word processor (I heard it about the old WordStar, severely dating myself...) The person shows her boss how you can change "all the 'a's to 'e's in a document." They then say "...and you can change them right back the same way. Uh. Oh." Of course, at this point, they discover that the Undo feature only backsteps one event. :)
--
$you = new YOU;
honk() if $you->love(perl)
| [reply] |
Re: Ordering hash replacements to avoid clobbering things (update chaining)
by BrowserUk (Patriarch) on Mar 01, 2003 at 05:18 UTC
|
Somewhat belated, but I had fun exploring different methods. Whether this is better in any way I doubt, but somebody might find it interesting.
#! perl -slw
use strict;
use Data::Dumper;
sub unwindDependancies{
my ($hashref) = @_;
my %copy = %{$hashref};
my @keys = keys %$hashref;
my @deps;
for (@keys) {
next unless exists $copy{$_};
push @deps, [ $_ ];
while( exists $hashref->{$_}) {
unshift( @{ $deps[-1] }, $_ = $hashref->{$_} );
@{$deps[-1]} > @keys and die('Circular reference found inv
+olving:', "@{$deps[-1]}");
delete $copy{$_};
}
}
my %h;
for (@deps) {
my $r = \%h;
$r = exists $r->{$_} ? $r->{$_} : ($r->{$_} = {}) for @$_;
}
#print Dumper \%h;
my (@order, @stack);
my $r=\%h;
my $t=0;
{
while(my ($key, $val) = each %$r) {
push @order, $key;
push(@stack, $r), $r = $val, next if keys %$val;
}
$r = pop(@stack), redo if @stack;
}
# print "@order";
return @order;
}
my %test = ( qw[A B B C D E E F F G G C J H H F I H K H N M
+ O P P Q Q N S U T V]);
print join' ', unwindDependancies \%test;
my %replace = ( qw/
COM SEC
MOB HIS
MOC COM
ICM1 INO
ICM2 INO
EU HIS
CY GRE
AE MOB
IN ICC
GR GRE
MH MOC
CO MOC
MO HIS
/ );
print join' ', unwindDependancies \%replace;
__DATA__
C:\test>238721
V T C G F H I J K E D B A U S M N Q P O
ICC IN SEC COM MOC CO MH HIS EU MO MOB AE GRE CY GR INO ICM1 ICM2
Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.
| [reply] [d/l] |
|
|