PerlMonks  

How best to compare hash values?

by jimbass (Novice)
on May 05, 2010 at 21:24 UTC

jimbass has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I'm obviously new and have a problem understanding the right tool to use to make mathematical comparisons between the values of two different hashes that (should) have the same keys.

I have many point to multi-point radios at work, and the signal strengths can be tested by using telnet. I hacked a script together using expect to get on the radios in question and run the tests, then used shell tools like grep and cut to get the data reduced to just what (I think) I need.

The data ends up in a hash, and it has this form:

%hash1 = ("1", "20", "2", "20", "4", "19", "5", "20", "6", "18");
%hash2 = ("1", "19", "2", "20", "4", "16", "5", "19", "6", "20");

I've read through google about tools (Data::Compare) that will compare the entire hash and report if they are identical or not, but that isn't what I'm looking to do. I know that except for one time in 1000, %hash1 and %hash2 won't be identical.

What I'm looking to accomplish is to have perl look at each key/value in the first hash, and simply compare it to the same key/value in the second.

The result I'd like to see, based on the example above, is something like:

1 (20-19)=1
2 (20-20)=0
4 (19-16)=3
5 (20-19)=1
6 (18-20)=-2

What I'm not grasping is how to get perl to compare the individual key->value to the second key->value.

I probably made it seem like I'm looking for the output to be a third hash. That wouldn't be bad, but it also isn't necessary; all I need is the difference in scores. The examples are real values, and everything in the hashes is a number; there aren't any strings. The numbers do get a bit uglier (the second hash is made up of the averages of several tests).

It's also distinctly possible that I'm going about this in completely the wrong way, and if so please point me in the right direction!

Thanks!

Replies are listed 'Best First'.
Re: How best to compare hash values?
by kennethk (Abbot) on May 05, 2010 at 21:37 UTC
    Is there a reason you are using hashes and not arrays? I ask because your keys are only numbers, and nearly sequential numbers at that - this would seem to be the classic example of when to use arrays.

    But, assuming you want to do this with hashes, you would first build a sorted key list (see the FAQ How do I sort a hash (optionally by value instead of key)?) and then iterate over that list and output your desired result (How do I process an entire hash?). Something like:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my %hash1 = ("1", "20", "2", "20", "4", "19", "5", "20", "6", "18");
    my %hash2 = ("1", "19", "2", "20", "4", "16", "5", "19", "6", "20");
    my %differences;

    for my $key ( sort {$a <=> $b} keys %hash1) {
        $differences{$key} = $hash1{$key} - $hash2{$key};
        print "$key ($hash1{$key} - $hash2{$key}) = $differences{$key}\n";
    }

    print "\n", Dumper \%differences;

    __END__
    1 (20 - 19) = 1
    2 (20 - 20) = 0
    4 (19 - 16) = 3
    5 (20 - 19) = 1
    6 (18 - 20) = -2

    $VAR1 = {
              '6' => -2,
              '4' => 3,
              '1' => 1,
              '2' => 0,
              '5' => 1
            };

    and using arrays instead:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    # Index 0 and index 3 are placeholders: there is no radio 0 or radio 3.
    my @array1 = (undef,20,20,undef,19,20,18);
    my @array2 = (undef,19,20,undef,16,19,20);
    my @differences;

    for my $i (0 .. $#array1) {
        next unless defined $array1[$i];
        $differences[$i] = $array1[$i] - $array2[$i];
        print "$i ($array1[$i] - $array2[$i]) = $differences[$i]\n";
    }

    print "\n", Dumper \@differences;

    __END__
    1 (20 - 19) = 1
    2 (20 - 20) = 0
    4 (19 - 16) = 3
    5 (20 - 19) = 1
    6 (18 - 20) = -2

    $VAR1 = [
              undef,
              1,
              0,
              undef,
              3,
              1,
              -2
            ];

    Update: Added storage of differences

      Thanks guys!

      The reason I went with a hash instead of an array is that I need to know the difference and the key at the same time, i.e. it is key 1 that has a difference of 1, or key 6 that has a difference of -2. Each key is a radio, and I need to know where to send the crew to fix the radio that fell out of alignment.

      I'm not opposed to using arrays, I simply thought since I needed the value relative to the key, a hash was the way to go.

      It is possible that I get a key in one hash that doesn't appear in another. If a new radio is added without me changing the "perfect" hash then the 'unique' key would appear in hash2, and if a radio loses power or connection then the 'unique' key would appear in hash1. Is there a simple escape from that problem, or should I just let it error out?

        But if your keys are just numbers, then the array index can be the number you need. Note that in my example, I used undef as a place holder for array elements that did not have an entry. Unless your indices are not even remotely sequential or you have non-integer indices, an array will be faster, take less memory and will apply a useful constraint on your keys.

        If there is a possibility that your arrays or hashes do not have equivalent keys, it is considered best practice to test for that before processing the values. Modifying the previously posted code:

        Hashes:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper;

        my %hash1 = ("1", "20", "2", "20", "4", "19", "5", "20", "6", "18", "7", "20");
        my %hash2 = ("1", "19", "2", "20", "4", "16", "5", "19", "6", "20", "8", "20");
        my %differences;

        for my $key ( sort {$a <=> $b} keys %hash1) {
            unless (defined $hash2{$key}) {
                print "$key is in \%hash1 but not \%hash2\n";
                next;
            }
            $differences{$key} = $hash1{$key} - $hash2{$key};
            print "$key ($hash1{$key} - $hash2{$key}) = $differences{$key}\n";
        }

        print "\n", Dumper \%differences;

        __END__
        1 (20 - 19) = 1
        2 (20 - 20) = 0
        4 (19 - 16) = 3
        5 (20 - 19) = 1
        6 (18 - 20) = -2
        7 is in %hash1 but not %hash2

        $VAR1 = {
                  '6' => -2,
                  '4' => 3,
                  '1' => 1,
                  '2' => 0,
                  '5' => 1
                };

        Arrays:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper;

        # Index 0 and index 3 are placeholders: there is no radio 0 or radio 3.
        my @array1 = (undef,20,20,undef,19,20,18,20);
        my @array2 = (undef,19,20,undef,16,19,20,undef,20);
        my @differences;

        # Walk up to the highest index used by either array.
        my $top = $#array1 > $#array2 ? $#array1 : $#array2;

        for my $i (1 .. $top) {
            if (defined $array1[$i] and not defined $array2[$i]) {
                print "$i is in \@array1 but not \@array2\n";
            }
            if (not defined $array1[$i] and defined $array2[$i]) {
                print "$i is in \@array2 but not \@array1\n";
            }
            next unless defined $array1[$i] and defined $array2[$i];
            $differences[$i] = $array1[$i] - $array2[$i];
            print "$i ($array1[$i] - $array2[$i]) = $differences[$i]\n";
        }

        print "\n", Dumper \@differences;

        __END__
        1 (20 - 19) = 1
        2 (20 - 20) = 0
        4 (19 - 16) = 3
        5 (20 - 19) = 1
        6 (18 - 20) = -2
        7 is in @array1 but not @array2
        8 is in @array2 but not @array1

        $VAR1 = [
                  undef,
                  1,
                  0,
                  undef,
                  3,
                  1,
                  -2
                ];

        Note that the array version, unlike the hash version, also reports a value that is present in the second set but missing from the first.

        You can use for example get_unique and get_complement from List::Compare to check what keys are missing.
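
For readers who would rather not install a non-core module, the same check can be sketched with grep and exists. This is only an illustration: the helper name missing_keys is hypothetical and is not part of List::Compare, the sample data is copied from the hashes used earlier in the thread, and the numeric sort assumes the keys are numbers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the keys of %$have that are absent from %$other.
# Assumes numeric keys, hence the numeric sort.
sub missing_keys {
    my ($have, $other) = @_;
    return sort { $a <=> $b } grep { !exists $other->{$_} } keys %$have;
}

my %hash1 = (1 => 20, 2 => 20, 4 => 19, 5 => 20, 6 => 18, 7 => 20);
my %hash2 = (1 => 19, 2 => 20, 4 => 16, 5 => 19, 6 => 20, 8 => 20);

my @only_in_1 = missing_keys(\%hash1, \%hash2);
my @only_in_2 = missing_keys(\%hash2, \%hash1);

print "Only in \%hash1: @only_in_1\n";
print "Only in \%hash2: @only_in_2\n";
```

Calling the helper in both directions gives the two lists of radios that need attention before any subtraction is attempted.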
Re: How best to compare hash values?
by choroba (Cardinal) on May 05, 2010 at 21:31 UTC
    Just go through all the keys in one hash and compare the values.
    foreach my $key ( sort keys %hash1 ) {
        print $hash1{$key} - $hash2{$key}, "\n";
    }
    Problems may occur if a key is not present in both hashes, though.

      I generally do it this way too, but I check all keys that appear in either hash. Something like:

      my %union = ( %hash1, %hash2 );
      my @keys = sort keys %union;
      undef %union;

      foreach my $key ( @keys ) {
          if ( ! exists $hash1{$key} ) {
              print "key $key is not in hash 1\n";
          }
          elsif ( ! exists $hash2{$key} ) {
              print "key $key is not in hash 2\n";
          }
          ....

      - doug
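
The snippet above trails off, so here is one way the idea could be carried through to the subtraction. It is a sketch rather than doug's actual code: the sub name compare_hashes and its return structure are assumptions made for illustration, and the numeric sort assumes numeric keys.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: walk the union of keys of two hash references and
# return (\%differences, \@problems).
sub compare_hashes {
    my ($h1, $h2) = @_;
    my (%differences, @problems);

    # Merging both hashes yields every key that occurs in either one.
    my %union = (%$h1, %$h2);

    for my $key (sort { $a <=> $b } keys %union) {
        if (!exists $h1->{$key}) {
            push @problems, "key $key is not in hash 1";
        }
        elsif (!exists $h2->{$key}) {
            push @problems, "key $key is not in hash 2";
        }
        else {
            # Key is present on both sides, so the subtraction is safe.
            $differences{$key} = $h1->{$key} - $h2->{$key};
        }
    }
    return (\%differences, \@problems);
}

my %hash1 = (1 => 20, 2 => 20, 4 => 19, 5 => 20, 6 => 18, 7 => 20);
my %hash2 = (1 => 19, 2 => 20, 4 => 16, 5 => 19, 6 => 20, 8 => 20);

my ($diff, $problems) = compare_hashes(\%hash1, \%hash2);
print "$_ => $diff->{$_}\n" for sort { $a <=> $b } keys %$diff;
print "$_\n" for @$problems;
```

Returning the problems separately from the differences keeps the arithmetic free of undefined values while still reporting radios that appear on only one side.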

Re: How best to compare hash values?
by Khen1950fx (Canon) on May 06, 2010 at 10:28 UTC
    I tried a variation on your question. What I wanted was a quick "visual" and diff of the arrays. I used Array::Heap and altered the heap.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Array::Heap;
    use Data::Dumper::Concise;
    use Text::Diff;
    use Text::Diff::Table;

    print "==\n";
    my @arr1 = qw[20 20 19 20 18];
    make_heap @arr1;
    print Dumper @arr1;

    print "==\n";
    my @arr2 = qw[19 20 16 19 20];
    make_heap @arr2;
    print Dumper @arr2;

    print "==\n";
    my $file1 = \@arr1;
    my $file2 = \@arr2;
    my $diff = Text::Diff::Table->new;
    $diff = diff $file1, $file2, { STYLE => "Table" };
    print $diff, "\n";
Re: How best to compare hash values?
by jerryg (Initiate) on May 06, 2010 at 16:17 UTC

    Here's my attempt at an answer. Personally speaking, I love hashes. You just have to be aware of the limitations of which approach you take (array vs. hash).

    I created a sub "isNull" as a homegrown replacement for "exists". The idea is to check your assumptions (and data) before you operate on that data.

    #!/usr/bin/perl -w
    use strict;

    sub isNull;

    # Begin Main
    my %hash1 = (); # Initialize empty hash. Perhaps unnecessary?
    %hash1 = ( "1", "20", "2", "20", "4", "19", "5", "20", "10", "20", "6", "18");
    my %hash2 = (); # Initialize empty hash. Perhaps unnecessary?
    %hash2 = ( "1", "19", "2", "20", "4", "16", "5", "19", "6", "20");

    foreach my $thisKey (sort keys %hash1) {
        # Check whether key from hash2 returns null to protect from undefined key for hash2
        if (isNull(\%hash2, $thisKey)) {
            print "hash2 contains null value for key: $thisKey", "\n";
        }
        else {
            my $result = $hash1{$thisKey} - $hash2{$thisKey};
            print "$thisKey $hash1{$thisKey} minus $hash2{$thisKey} equals $result", "\n";
        }
    } # End foreach
    exit 0;
    # End Main

    sub isNull {
        my $hashref = shift; # Reference to hash
        my $key = shift;     # key to check
        my $rc = 0;
        if ($hashref->{$key}) {
            $rc = 0; # Hash returns non-null value
        }
        else {
            $rc = 1; # Hash returns Null value
        }
        return $rc;
    } # End sub isNull
      my %hash1 = (); # Initialize empty hash. Perhaps unnecessary?
      Yes, this is unnecessary. Either:
      my %hash1;

      or:

      my %hash1 = ( "1", "20", "2", "20", "4", "19", "5", "20", "10", "20", "6", "18");

      Fat commas are nice too: 1 => 20,

      I created a sub "isNull" as a homegrown replacement for "exists". The idea is to check your assumptions (and data) before you operate on that data.
      Your isNull sub is not needed with your data. Try:
      unless ($hash2{$thisKey}) {

      instead of:

      if (isNull(\%hash2, $thisKey)) {
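
One caveat applies to both the plain truth test and isNull: a stored value of 0 is false in Perl, so a radio that legitimately reports 0 would be treated as missing. The sketch below contrasts the truth test with defined and exists; the sample values and the helper name describe are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample data: key 2 holds a real reading of 0, key 3 holds undef,
# and key 4 is not present at all.
my %hash2 = (1 => 19, 2 => 0, 3 => undef);

# Hypothetical helper: describe how each of the three tests sees a key.
sub describe {
    my ($h, $key) = @_;
    return join ' ',
        "key $key:",
        'true='    . ($h->{$key}          ? 1 : 0),
        'defined=' . (defined($h->{$key}) ? 1 : 0),
        'exists='  . (exists($h->{$key})  ? 1 : 0);
}

print describe(\%hash2, $_), "\n" for 1 .. 4;
```

If a reading of 0 is possible, exists (is the key there at all?) or defined (does it hold a value?) is the safer guard before subtracting.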

        Thanks for the additional code and such, guys. I've come to see there are quite a few different ways of accomplishing the same thing!

        I'm working with arrays now rather than hashes, and a mis-coding on my part showed that I may not even need arrays for some things; simple variables seem to capture the data too!

        I'm working now on capturing output from different short scripts, and it seems that rather than calling other scripts, it's best to do everything possible from a single perl script. Using strict and warnings is forcing me to do a fair amount of reading, which is helpful. Thanks again to everyone!

Re: How best to compare hash values?
by marvin13 (Initiate) on May 07, 2010 at 16:29 UTC
    == Removed ==
