PerlMonks  

Need advice on checking two hashes values and keys

by perlynewby (Scribe)
on Jun 03, 2015 at 21:35 UTC ( [id://1129003]=perlquestion )

perlynewby has asked for the wisdom of the Perl Monks concerning the following question:

I'm building my Perl skills with incremental test cases that I make up as I go, but hashes are still a bit confusing, so I need a little guidance on how they work.

I made up a little program to practice with hashes.

It checks whether a number is found in both files. I used exists on the keys, but the output is not what I want. Can you advise me on how to do this properly? I thought that by checking the keys I'd get a single key => value line, but the French numbers aren't there ...

1st data

uno = uno
due = dos
tre = tres
quattro = quatro
cinque = cinco
sei = seis
sette = siete
otto = ocho
nouve = nueve
dieci =diez

2nd data *corrected typo on 2,3 in French.

uno = un
due = deux
tre = trois
quattro = quatre
cinque = cinq
sei = six
sette = sept
dieci = dix

OUTPUT:Italian => Spanish , French

uno => uno, un
due => dos , deux
tre => tres , trois
quattro => quatro, quatre
cinque => cinco , cinq
sei => seis, six
sette => siete, sept
dieci => diez, dix

use strict;
use warnings;
use Data::Dump qw(dump);
use Storable; #don't know this yet but let's try soon.

my %hash;

#opening my file handles; adding recommendations with naming extensions
open my $in, '<',"./test_data.txt" or die ("can't open the file:$!\n");
open my $in1,'<',"./test_data1.txt" or die ("can't open file : $!\n");
open my $out ,'>' ,"./test_data_out.txt" or die "can't open the file for write:$!\n";
open my $out1 ,'>',"./test_data_out1_no_match.txt" or die "can't open file for write:$!\n";

while (<$in>){
    chomp;
    my ($key, $value)= split (/\s*=\s*/); #greedy matching for better regex coverage as per Ken
    $hash{$key}=$value;
}
close $in;

while (<$in1>){
    chomp;
    my ($key,$value) = split (/\s*=\s*/); #splits row into 2 col.
    #checks for keys that EXIST in both then prints...?
    if (exists $hash{$key}){
        print $out "$key => $hash{$key} , $value \n";
    }
    else {
        print $out1 "$key => $value \n";
    }
}
close $in1;
close $out;
close $out1;

Replies are listed 'Best First'.
Re: Need advice on checking two hashes values and keys
by kcott (Archbishop) on Jun 03, 2015 at 23:56 UTC

    G'day perlynewby,

    You're sort of on the right track. Here's where I see problems:

    • You only need one hash — see code examples below.
    • Splitting on /\s*=\s/ won't work with "dieci =diez" because there's no whitespace after the '=': you just need to change that to /\s*=\s*/ (greedily matching zero or more whitespace characters either side of the '=').
    • I suspect where you really got lost was with the foreach loop: probably due to using two hashes in the first place.
    • In addition, just throwing random code at the problem is a big mistake. For instance, you don't use Storable and I see no reason to sort keys. While you're learning, adding code to see what it does can be very useful; just leaving it there afterwards becomes problematic.
    • While I don't see that it's caused a problem here, I'd recommend giving a bit more thought to your naming conventions. You have $in1 associated with the data2 file, while $out1 is associated with the data_out1 file: variable 1, 2nd file, labelled 2 vs. variable 1, 2nd file, labelled 1.
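The difference between the two split patterns mentioned in the second bullet can be checked directly. This is only a minimal sketch: the helper name `split_pair` and the variable names are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: split a "key = value" line on a given pattern.
sub split_pair {
    my ($pattern, $line) = @_;
    return split $pattern, $line, 2;
}

my $strict  = qr/\s*=\s/;    # requires one whitespace character after '='
my $lenient = qr/\s*=\s*/;   # whitespace is optional on both sides of '='

for my $line ('sei = seis', 'dieci =diez') {
    my @strict_fields  = split_pair($strict,  $line);
    my @lenient_fields = split_pair($lenient, $line);
    printf "%-14s strict: %d field(s), lenient: %d field(s)\n",
        "'$line'", scalar @strict_fields, scalar @lenient_fields;
}
```

With the stricter pattern, "dieci =diez" comes back as a single unsplit field, so the value ends up undefined; the lenient pattern yields a key and a value for both lines.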

    Having pointed out places where there's problems, I will commend you on the IO: lexical filehandles; 3-argument open; checking return values; using $!. All good - well done! In the examples below, I've used Inline::Files. Read about it if you want. The only pertinent part is that I'm using a while loop with a filehandle: read it the same as your code, I'm just using a different filehandle.

    "check if the number is found in both files."

    Let's start with doing just that and nothing else.

    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    use Inline::Files;

    my %seen;

    while (<ITES>) {
        ++$seen{(split)[0]};
    }

    while (<ITFR>) {
        my $key = (split)[0];
        print $key if $seen{$key};
    }

    __ITES__
    uno = uno
    due = dos
    tre = tres
    quattro = quatro
    cinque = cinco
    sei = seis
    sette = siete
    otto = ocho
    nouve = nueve
    dieci =diez
    __ITFR__
    uno = un
    due = due
    tre = tris
    quattro = quatre
    cinque = cinq
    sei = six
    sette = sept
    dieci = dix

    Output:

    uno
    due
    tre
    quattro
    cinque
    sei
    sette
    dieci

    Now you have working code that does what you want. One hash; two while loops; no foreach required.

    Let's build on that to get the output you're after.

    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    use Inline::Files;

    my %seen;

    while (<ITES>) {
        chomp;
        my ($key, $val) = split /\s*=\s*/;
        $seen{$key} = $val;
    }

    while (<ITFR>) {
        chomp;
        my ($key, $val) = split /\s*=\s*/;
        print "$key => $seen{$key}, $val" if $seen{$key};
    }

    __ITES__
    uno = uno
    due = dos
    tre = tres
    quattro = quatro
    cinque = cinco
    sei = seis
    sette = siete
    otto = ocho
    nouve = nueve
    dieci =diez
    __ITFR__
    uno = un
    due = due
    tre = tris
    quattro = quatre
    cinque = cinq
    sei = six
    sette = sept
    dieci = dix

    Output:

    uno => uno, un
    due => dos, due
    tre => tres, tris
    quattro => quatro, quatre
    cinque => cinco, cinq
    sei => seis, six
    sette => siete, sept
    dieci => diez, dix

    As you can see, the basic structure of the code hasn't changed. The first while loop is almost identical to yours (with the regex fixed).

    The second while loop starts like yours. But then just uses the same print ... if $seen{$key}; from my first example; the only real difference is that, having captured more data, we now have more information to print.

    To learn more about hashes in Perl, see "perldata - Perl data types" and "perldsc - Perl Data Structures Cookbook".

    Lastly, you have spelling mistakes in your data. For instance, two and three in French are deux and trois. I'll leave you to check the rest.

    -- Ken

      thanks for the advice on improving. I will follow those.

      I like how you call the file in to be read with Inline::File module. however, I am using netbeans and I don't know how to load up a module from CPAN to this IDE. it seems to be asking for .nbm file while CPAN is providing a .PL extension. don't know if these are friendly to each other. any ideas?

        "I like how you call the file in to be read ..."

        No external files are involved. There's just data embedded in the script.
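If installing a CPAN module is inconvenient, core Perl can also read embedded data by opening a filehandle on a reference to a string. This is a sketch of that alternative, not what the examples above use; the names `read_pairs`, `$ites` and `$itfr` are made up for illustration, and the data is a shortened subset.

```perl
use strict;
use warnings;

# Data embedded in the script as plain strings (illustrative subset).
my $ites = "uno = uno\ndue = dos\ndieci =diez\n";
my $itfr = "uno = un\ndieci = dix\n";

# Hypothetical helper: read "key = value" lines from a string into a hash.
sub read_pairs {
    my ($text) = @_;
    # Opening a reference to a scalar gives an in-memory filehandle.
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my %pairs;
    while (<$fh>) {
        chomp;
        my ($key, $val) = split /\s*=\s*/;
        $pairs{$key} = $val;
    }
    close $fh;
    return %pairs;
}

my %spanish = read_pairs($ites);
my %french  = read_pairs($itfr);

for my $key (sort keys %french) {
    print "$key => $spanish{$key}, $french{$key}\n" if exists $spanish{$key};
}
```

The while loop over the filehandle reads exactly like the file-based version, so the same code can later be pointed at real files.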

        "... with Inline::File module."

        The module I used, and provided a link to, was Inline::Files (with an 's' at the end). This module is about 2 weeks old. I simply installed it using the cpan utility, which comes with the standard Perl distribution, like this from the command line:

        $ cpan Inline::Files

        You can find Inline::File (no 's' at the end) on CPAN. That name is probably just a typo: it provides the module Inline::Files. That module, however, is about 12 years old. So, make sure you're accessing the Inline::Files module I originally indicated.

        "... I am using netbeans and I don't know how to load up a module from CPAN to this IDE."

        I've never used "netbeans". I can't help with this; perhaps another monk can.

        "it seems to be asking for .nbm file while CPAN is providing a .PL extension."

        You'd be better off posting a verbatim copy of what "it seems to be asking for" rather than this vague description. Here's the Inline::Files MANIFEST: that may, or may not, be useful.

        -- Ken

Re: Need advice on checking two hashes values and keys
by aaron_baugher (Curate) on Jun 03, 2015 at 22:18 UTC

    Your first loop is fine; it reads lines from the first file and puts them in a hash as keys and values. Your second loop is kind of a mess. You have a few choices:

    1. On each line, find its key in the hash and go ahead and print the key (Italian), the value already in the hash (Spanish), and the value found in the current file (French).
    2. On each line, save the key and value into a new hash. Then after the second loop, have a third loop that goes through one of the hashes and prints out the keys and their values from each hash.
    3. Instead of saving the values in two hashes as simple scalars, save them in a single hash as a two-element array. So the hash would be structured like this:
      %hash = (
          uno => [ 'uno','un' ],
          due => [ 'dos','due' ],
          tre => [ 'tres','tris' ],
          # and so on
      );
      This would mean changing your first loop so that it stores keys and values as $hash{$key}[0] and those from the second loop as $hash{$key}[1].

    A problem with solutions #2 and #3 is that a hash is not ordered, so when you loop through the hash to print out the lines, they will not be in the order you want. To fix that, you would have to use an array of arrays instead of a hash, or keep a separate array of the keys to hold their order, or use a module that provides an ordered hash. If you use solution #1, you'll be printing them out in the same order you find them in the second file, which appears to be what you want.
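Solution #3 combined with "a separate array of the keys to hold their order" might look roughly like this. It is a sketch with inline lists standing in for the two files; `merge_lines`, `@order` and the list names are illustrative, not from the original script.

```perl
use strict;
use warnings;

my @spanish_lines = ('uno = uno', 'due = dos', 'otto = ocho', 'dieci =diez');
my @french_lines  = ('uno = un',  'due = deux', 'dieci = dix');

# Hypothetical helper: build the hash of arrays and remember key order.
sub merge_lines {
    my ($first, $second) = @_;
    my (%hash, @order);
    for my $line (@$first) {
        my ($key, $value) = split /\s*=\s*/, $line;
        push @order, $key unless exists $hash{$key};   # first time this key is seen
        $hash{$key}[0] = $value;
    }
    for my $line (@$second) {
        my ($key, $value) = split /\s*=\s*/, $line;
        $hash{$key}[1] = $value if exists $hash{$key};
    }
    return (\%hash, \@order);
}

my ($hash, $order) = merge_lines(\@spanish_lines, \@french_lines);

# Walk the keys in the order they appeared in the first list.
for my $key (@$order) {
    my ($es, $fr) = @{ $hash->{$key} };
    next unless defined $fr;          # key was missing from the second list
    print "$key => $es, $fr\n";
}
```

The hash still gives fast lookup by key; the extra array exists only to replay the keys in their original order.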

    Aaron B.
    Available for small or large Perl jobs and *nix system administration; see my home node.

      cool. nice explanation and advice on attack.

      Your number 3 seems a little more advanced than where I am, but I think I want to use this method. I see a lot of potential here. Although the numerical ordering is nice, I think putting it in this form will make me practice more difficult hashes; afterwards, I can practice ordering.

      so... more questions will follow on this 3rd type of exercise...thanks all for the help and advice.

        You're welcome. One way of doing it with a hash of arrays while maintaining the original order would be to save the line number in which you find each key together with its value. So your hash after the first loop would look like this:

        %hash = (
            uno => [ 1, 'uno' ],
            due => [ 2, 'dos' ],
            tre => [ 3, 'tres' ],
            # and so on
        );

        Then after the second loop it would look like this:

        %hash = (
            uno => [ 1, 'uno','un' ],
            due => [ 2, 'dos','due' ],
            tre => [ 3, 'tres','tris' ],
            # and so on
        );

        Then you'd need to learn how to sort the hash on the first element in each sub-array so that you can print them out in order. If you want to try that, then inside your first loop, you can get the line number to go with each key/value pair from the special $. variable.
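A sketch of that idea: store the line number from `$.` as the first element of each sub-array, then sort the keys numerically on it. The helper name `read_with_line_numbers` and the in-memory data are assumptions made for illustration; a real script would read from the opened file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: read "key = value" lines from an open filehandle,
# storing [ line number, value ] for each key.
sub read_with_line_numbers {
    my ($fh) = @_;
    my %hash;
    while (<$fh>) {
        chomp;
        my ($key, $value) = split /\s*=\s*/;
        $hash{$key} = [ $., $value ];    # $. is the current input line number
    }
    return %hash;
}

my $data = "uno = uno\ndue = dos\ntre = tres\n";
open my $in, '<', \$data or die "can't open in-memory file: $!";
my %hash = read_with_line_numbers($in);
close $in;

# Sort the keys numerically on the first element of each sub-array.
for my $key (sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash) {
    print "$key => $hash{$key}[1]\n";
}
```

The `<=>` operator compares numbers, which matters once there are ten or more lines; a string comparison with `cmp` would put line 10 before line 2.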

        Aaron B.

      I appreciate all the advice and examples on manipulating hashes, hash tables, and code improvement, and I will work through all of these with your help.

      Should I create a new thread for each example or stick with this thread? Maybe some other newbie can learn from it.

      OK, I've been playing with a hash ref to get a hash whose values hold two elements.

      It hasn't worked yet. Could you explain the mistake I made with the hash ref? This is the error it gave me:

      Can't use string ("dos") as an ARRAY ref while "strict refs" in use at C:\Users\Alberto\Documents\NetBeansProjects\PerlProject\Perl Essentials\hash_ref_6_4.pl line 34, <$in1> line 1.

      use strict;
      use warnings;
      use Data::Dump qw(dump);

      my %hash;

      #my file handles UNTIL I figure how to install the Inline::File module to netbeans IDE
      open my $in, '<',"./test_data.txt" or die ("can't open the file:$!\n");
      open my $in1,'<',"./test_data1.txt" or die ("can't open file : $!\n");
      open my $out ,'>' ,"./test_data_out.txt" or die "can't open the file for write:$!\n";
      open my $out1 ,'>',"./test_data_out1_no_match.txt" or die "can't open file for write:$!\n";

      #creating hash
      while (<$in>){
          chomp;
          my ($key,$value)= split(/\s*=\s*/); #allow for space before or after the word
          $hash{$key}=$value;
      }
      close $in;

      #using the first hash
      while (<$in1>){
          chomp;
          my($key,$value)=split/\s*=\s*/ ;
          #push the value to existing hash as to get reference if key exists
          # %hash =( It => [Spa Fre])
          #using one hash as per Ken code suggestion??
          push @{$hash{$key}},$value if $hash{$key}; #I don't know how "push" works
          print $out dump (\%hash);
      }
      close $in1;
      close $out;
      close $out1;

        (I'd suggest that you keep posting to this thread as long as you're working on the same problem, unless people stop responding to it.)

        Ok, let's say you want to create a hash of two-element arrays, with the hash keys being the Italian numbers, each one pointing to a reference to a two-element array holding the Spanish and French numbers, in that order. Then as you're going through the first loop (Italian = Spanish), you need to insert the Spanish numbers as the first element of an array rather than a simple value:

        $hash{$italian} = [ $spanish ];

        The square brackets return a reference to an array, which is a scalar that can be stored as a value in the hash. So now it looks like this, with references to one-element arrays as the values:

        %hash = (
            uno => [ 'uno' ],
            due => [ 'dos' ],
            # and so on
        );

        Then in the second loop, you need to add the French numbers to the arrays corresponding to their matching Italian hash keys. There are two ways you could do this:

        # by assigning directly to the second element of the sub-array
        $hash{$italian}[1] = $french;

        # or by dereferencing the sub-array pointed to by the hash value
        # and pushing the new value onto the end of that array
        push @{$hash{$italian}}, $french;

        # Either way, you'll end up with:
        %hash = (
            uno => [ 'uno','un' ],
            due => [ 'dos','deux' ],
            # and so on
        );

        Then when you're ready to print them out, you loop through the keys of the hash, printing the key and the elements of the sub-array as you wish:

        for my $key (keys %hash){
            print $key, ' => ', join ' , ', @{$hash{$key}}; # dereference sub-array
            print "\n";
        }

        The trick is keeping track of what level of the structure you're dealing with, and getting the sigils (and arrows, if necessary) right for pointing to the right things, whether values or references.
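To connect this back to the error message quoted earlier in the thread, here is a small sketch that provokes the same kind of failure on purpose and then shows the version that works. The eval is there only to catch the error so the script can continue; it is not part of the suggested fix.

```perl
use strict;
use warnings;

my %hash;

# Storing a plain string, as the first loop of the failing script did.
$hash{due} = 'dos';

# Dereferencing that string as an array is what "strict refs" rejects.
my $ok = eval { push @{ $hash{due} }, 'deux'; 1 };
print "push failed: $@" unless $ok;

# Storing an array reference instead makes the same push work.
$hash{due} = [ 'dos' ];
push @{ $hash{due} }, 'deux';
print "due => ", join(' , ', @{ $hash{due} }), "\n";
```

The push itself is the same in both cases; what differs is whether the value already stored under the key is a string or an array reference.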

        Aaron B.

Re: Need advice on checking two hashes values and keys
by FreeBeerReekingMonk (Deacon) on Jun 03, 2015 at 21:51 UTC

    Uh... you iterate over the file with while (<$in1>){ but inside that loop you iterate AGAIN, with foreach my $key (sort keys %hash){
    That does not make much sense. A hash is designed so that you do not need to iterate over it to find a key. You are treating the hash as if it were an array.

    while (<$in1>){
        chomp;
        my ($key,$value) = split (/\s*=\s*/);
        if (exists $hash{$key}){
            print $out "$key => $hash{$key}, $value \n";
        }else{
            print $out1 "$key => $value \n";
        }
    }

      GRR, yes, the iteration was a previous attempt to use all "IF THEN" checks on the keys and values before I abandoned it and decided to ask for help, so I can learn hashes through coding practice. I should have commented out the foreach loop.

      Your code did clear up hashes a bit for me, though. I misunderstood how "exists" works. I wrongly inferred that if a key "exists" in both files, it would automatically keep the one key, regardless of how many identical keys there were with other values, and append the new values to it. I'm going to reread that section. Thanks.

      I will soon play with "defined" as per the other code example... so much to learn about hashes.
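For reference, a short sketch of the behaviour described above: exists only tests for a key, assignment replaces a value, and appending requires an array reference as the value. The hash contents are a made-up subset of the thread's data.

```perl
use strict;
use warnings;

my %hash = (uno => 'uno');

# exists only answers "is this key present?"; it never changes the hash.
print "uno is present\n" if exists $hash{uno};

# Assigning to an existing key replaces the old value; nothing is appended.
$hash{uno} = 'un';
print "after assignment: $hash{uno}\n";

# To keep both values, the hash value has to be an array reference.
my %both = (uno => ['uno']);
push @{ $both{uno} }, 'un';
print "both values: @{ $both{uno} }\n";

# exists and defined differ when a key holds undef.
$hash{otto} = undef;
print "otto exists but is not defined\n"
    if exists $hash{otto} && !defined $hash{otto};
```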

      I get this error, but the code didn't crash... why? What does this mean? Use of uninitialized value $value in concatenation (.) or string at C:\Users\Alberto\Documents\NetBeansProjects\PerlProject\Perl Essentials\giocando_con_il'ordinamento_hashes.pl line 39, <$in1> line 9.

        Never mind the concatenation error... it was a format issue in my test data: extra trailing spaces in one file.
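One way such a warning can arise is a line that contains only whitespace or has no '=' at all, so split returns no value. The message is a warning rather than a fatal error, which is why the script kept running. A defensive sketch, assuming a hypothetical helper `parse_line` that is not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line, returning an empty list
# when the line has no usable "key = value" pair.
sub parse_line {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;                       # trim leading/trailing whitespace
    my ($key, $value) = split /\s*=\s*/, $line, 2;
    return unless defined $key   && length $key;
    return unless defined $value && length $value;
    return ($key, $value);
}

for my $line ("sette = sept", "   ", "otto", "dieci = dix   ") {
    if (my ($key, $value) = parse_line($line)) {
        print "$key => $value\n";
    }
    else {
        print "skipped malformed line: '$line'\n";
    }
}
```

Skipping or reporting malformed lines keeps undefined values out of later string operations, so the warning never gets a chance to fire.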

Re: Need advice on checking two hashes values and keys
by GotToBTru (Prior) on Jun 03, 2015 at 21:52 UTC

    You reuse the variable $key and I think it might be confusing things for you. The if in your foreach loop will always be true because $key comes from keys in %hash.

    while (<$in1>) {
        chomp;
        my ($key1,$value1) = split (/\s*=\s*/);
        if (defined($hash{$key1})) {
            print "$key1 is in both!\n";
        }
    }
    Dum Spiro Spero
Re: Need advice on checking two hashes values and keys
by Laurent_R (Canon) on Jun 04, 2015 at 15:05 UTC
    TIMTOWTDI, There is more than one way to do it.

    Given the nature of the data (and depending how it is supposed to be used later), I would probably use an array of hashes (AoH), something like this:

    my @numbers = (
        undef,
        { it => "uno", sp => "uno", fr => "un"},
        { it => "due", sp => "dos", fr => "deux"},
        { it => "tre", sp => "tres", fr => "trois"},
        # ...
    );
    which yields a structure like this:
    0  ARRAY(0x6004f9c80)
       0  undef
       1  HASH(0x600636430)
          'fr' => 'un'
          'it' => 'uno'
          'sp' => 'uno'
       2  HASH(0x6005d18a8)
          'fr' => 'deux'
          'it' => 'due'
          'sp' => 'dos'
       3  HASH(0x6005d1920)
          'fr' => 'trois'
          'it' => 'tre'
          'sp' => 'tres'
    The advantage is that the array stays in order. Note that I created the first array element as undef, in order to have a natural correspondence between the element index and the numbers in the various languages (alternatively, I could have put a line for zero in all three languages). Each element of the array is a reference to a hash containing the number names in the various languages.

    To access the Italian name of 2, simply try:

    print $numbers[2]{it};
    which should happily print "due".
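Going the other way, from a word back to its number, means scanning the array. A sketch under the same structure; the helper name `number_for` is made up for illustration.

```perl
use strict;
use warnings;

my @numbers = (
    undef,
    { it => "uno", sp => "uno",  fr => "un"    },
    { it => "due", sp => "dos",  fr => "deux"  },
    { it => "tre", sp => "tres", fr => "trois" },
);

# Hypothetical helper: return the index whose word in $lang equals $word.
sub number_for {
    my ($lang, $word) = @_;
    for my $n (1 .. $#numbers) {       # start at 1 to skip the undef placeholder
        return $n if $numbers[$n]{$lang} eq $word;
    }
    return;                            # not found
}

my $n = number_for(fr => 'deux');
print "deux is number $n; in Spanish that is $numbers[$n]{sp}\n" if defined $n;
```

If reverse lookups were frequent, a second hash keyed on the word would avoid the scan; for ten numbers the loop is perfectly adequate.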

      Hey, someone said TIMTOWTDI. hehehe

      #!/usr/bin/perl
      # http://perlmonks.org/?node_id=1129003
      use warnings;
      use strict;

      $_ = <<END;
      uno = uno
      due = dos
      tre = tres
      quattro = quatro
      cinque = cinco
      sei = seis
      sette = siete
      otto = ocho
      nouve = nueve
      dieci =diez

      uno = un
      due = due
      tre = tris
      quattro = quatre
      cinque = cinq
      sei = six
      sette = sept
      dieci = dix
      END

      print "$1 => $2, $3\n" while
        /^(\w+) = *(\w+)\b(?=.*\n\n.*^\1 = (\w+))/gms;

        OH MIO DIO! this is a cool way to do it.

        I need to learn ALL kinds of hashes, hashrefs, hash tables, ... for now, since I still get confused by them, but using a regex looks cool and I will try this method too.

Re: Need advice on checking two hashes values and keys
by Random_Walk (Prior) on Jun 04, 2015 at 14:40 UTC

    Here is one without the hash, using a table. It does make me think a database would be better for this application :-) but perhaps we should look at the real application as being learning Perl...

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @numbers = (
        [qw( Italian Spanish French English Welsh  )],
        [qw( uno     uno     un     one     un     )],
        [qw( due     dos     deux   two     dau    )],
        [qw( tre     tres    trois  three   tri    )],
        [qw( quattro quatro  quatre four    pedwar )],
        [qw( cinque  cinco   cinq   five    pump   )],
        [qw( sei     seis    six    six     chwech )],
        [qw( sette   siete   sept   seven   saith  )],
        [qw( otto    ocho    huit   eight   wyth   )],
        [qw( nouve   nueve   neuf   nine    naw    )],
        [qw( dieci   diez    dix    ten     deg    )],
    );

    print "Please enter number to translate\n";
    my $num = <>;
    while ($num) {
        print "You typed $num\n";
        chomp $num;
        for my $row (@numbers) {
            # dereference array, and look for our number in it
            # (exact match, so the index search below always terminates)
            next unless grep { $_ eq $num } @$row;
            print "I found it: ";
            print join " <-> ", @$row;
            print "\n";
            my $i = 0;
            $i ++ until $row->[$i] eq $num;
            print "It looks like it was in $numbers[0]->[$i]\n";
            last;
        }
        print "Please enter another number to translate\n";
        $num = <>;
    }

    The question this raises, is what happens when you type in uno? Altering it to know when a number has multiple matches, is left as an exercise for the reader.

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!
Re: Need advice on checking two hashes values and keys
by Random_Walk (Prior) on Jun 04, 2015 at 14:40 UTC

    Here is one without the hash, using a table. It does make me think a database would be better for this application :-)

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @numbers = (
        [qw( Italian Spanish French )],
        [qw( uno     uno     un     )],
        [qw( due     dos     deux   )],
        [qw( tre     tres    trois  )],
        [qw( quattro quatro  quatre )],
        [qw( cinque  cinco   cinq   )],
        [qw( sei     seis    six    )],
        [qw( sette   siete   sept   )],
        [qw( otto    ocho    huit   )],
        [qw( nouve   nueve   neuf   )],
        [qw( dieci   diez    dix    )],
    );

    print "Please enter number to translate\n";
    my $num = <>;
    while ($num) {
        print "You typed $num\n";
        chomp $num;
        for my $row (@numbers) {
            # dereference array, and look for our number in it
            # (exact match, so the index search below always terminates)
            next unless grep { $_ eq $num } @$row;
            print "I found it: ";
            print join " <-> ", @$row;
            print "\n";
            my $i = 0;
            $i ++ until $row->[$i] eq $num;
            print "It looks like it was in $numbers[0]->[$i]\n";
            last;
        }
        print "Please enter another number to translate\n";
        $num = <>;
    }

    The question this raises, is what happens when you type in uno? Altering it to know when a number has multiple matches, is left as an exercise for the reader.
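One possible approach to that exercise, offered as a sketch rather than the intended answer: collect every column whose entry equals the word instead of stopping at the first. The helper name `languages_for` is made up, and the table is shortened to three rows.

```perl
use strict;
use warnings;

my @numbers = (
    [qw( Italian Spanish French )],
    [qw( uno     uno     un     )],
    [qw( due     dos     deux   )],
    [qw( tre     tres    trois  )],
);

# Hypothetical helper: list every language in which $word appears.
sub languages_for {
    my ($word) = @_;
    my ($header, @rows) = @numbers;    # first row holds the language names
    my @found;
    for my $row (@rows) {
        for my $i (0 .. $#$row) {
            push @found, $header->[$i] if $row->[$i] eq $word;
        }
    }
    return @found;
}

print "uno: ", join(', ', languages_for('uno')), "\n";
print "un: ",  join(', ', languages_for('un')),  "\n";
```

Checking each column with eq, rather than stopping at the first hit, is what lets a word shared by two languages be reported for both.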

    Cheers,
    R.


      I enjoyed your program and finding out what happens when uno is typed

      I meant to tell you earlier, but I am now using your program to learn and practice other little things in Perl. Thanks!

        Hi perlynewby,

        Thanks for the feedback. Nice to know I was of some help.

        Cheers,
        R.

