perlynewby has asked for the wisdom of the Perl Monks concerning the following question:
I am building on my Perl skills with incremental test cases that I make up as I go, but hashes are still a bit confusing, so I need a little guidance on how they work.
I wrote a small program to practice with hashes.
It should check whether a number is found in both files. I used exists on the keys, but the output is not what I want. Can you advise me on how to do this properly? I thought that by checking the keys I'd get a single key=value, but the French numbering isn't there ...
1st data
uno = uno
due = dos
tre = tres
quattro = quatro
cinque = cinco
sei = seis
sette = siete
otto = ocho
nouve = nueve
dieci =diez
2nd data *corrected typo on 2,3 in French.
uno = un
due = deux
tre = trois
quattro = quatre
cinque = cinq
sei = six
sette = sept
dieci = dix
OUTPUT:Italian => Spanish , French
uno => uno, un
due => dos , deux
tre => tres , trois
quattro => quatro, quatre
cinque => cinco , cinq
sei => seis, six
sette => siete, sept
dieci => diez, dix
use strict;
use warnings;
use Data::Dump qw(dump);
use Storable; #don't know this yet but let's try soon.
my %hash;
#opening my file handles; adding recommendations with naming extensions
open my $in, '<',"./test_data.txt" or die ("can't open the file:$!\n");
open my $in1,'<',"./test_data1.txt" or die ("can't open file : $!\n");
open my $out ,'>' ,"./test_data_out.txt" or die "can't open the file for write:$!\n";
open my $out1 ,'>',"./test_data_out1_no_match.txt" or die "can't open file for write:$!\n";
while (<$in>){
    chomp;
    my ($key, $value)= split (/\s*=\s*/); #greedy matching for better regex coverage as per Ken
    $hash{$key}=$value;
}
close $in;
while (<$in1>){
    chomp;
    my ($key,$value) = split (/\s*=\s*/); #splits row into 2 col.
    #checks for keys that EXIST in both then prints...?
    if (exists $hash{$key}){
        print $out "$key => $hash{$key} , $value \n";
    } else {
        print $out1 "$key => $value \n";
    }
}
close $in1;
close $out;
close $out1;
Re: Need advice on checking two hashes values and keys
by kcott (Archbishop) on Jun 03, 2015 at 23:56 UTC
G'day perlynewby,
You're sort of on the right track.
Here's where I see problems:
-
You only need one hash — see code examples below.
-
Splitting on /\s*=\s/ won't work with "dieci =diez" because there's no whitespace after the '=': you just need to change that to /\s*=\s*/ (greedily matching zero or more whitespace characters either side of the '=').
-
I suspect where you really got lost was with the foreach loop: probably due to using two hashes in the first place.
-
In addition, just throwing random code at the problem is a big mistake. For instance, you don't use Storable and I see no reason to sort keys. While you're learning, adding code to see what it does can be very useful; just leaving it there afterwards becomes problematic.
-
While I don't see that it's caused a problem here, I'd recommend giving a bit more thought to your naming conventions.
You have $in1 associated with the data2 file, while $out1 is associated with the data_out1 file:
variable 1, 2nd file, labelled 2 vs.
variable 1, 2nd file, labelled 1.
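The split point in the list above can be checked in isolation. This is a minimal sketch, not code from the thread: split_pair is a hypothetical helper, and the limit of 2 is an extra assumption so that a later '=' would stay inside the value.

```perl
use strict;
use warnings;

# Hypothetical helper: split one "key = value" line with the given pattern.
# The limit of 2 keeps any later '=' inside the value.
sub split_pair {
    my ($line, $pattern) = @_;
    return split $pattern, $line, 2;
}

for my $line ('uno = uno', 'dieci =diez') {
    my @strict  = split_pair($line, qr/\s*=\s/);     # needs whitespace after '='
    my @lenient = split_pair($line, qr/\s*=\s*/);    # whitespace optional on both sides
    printf "%-14s strict: %d field(s), lenient: %d field(s)\n",
        "'$line'", scalar @strict, scalar @lenient;
}
```

With the stricter pattern, "dieci =diez" comes back as a single field, so the value ends up undefined; the lenient pattern yields two fields for both lines.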
Having pointed out places where there are problems, I will commend you on the IO: lexical filehandles; 3-argument open; checking return values; using $!. All good - well done!
In the examples below, I've used Inline::Files.
Read about it if you want.
The only pertinent part is that I'm using a while loop with a filehandle: read it the same as your code, I'm just using a different filehandle.
"check if the number is found in both files."
Let's start with doing just that and nothing else.
#!/usr/bin/env perl -l
use strict;
use warnings;
use Inline::Files;
my %seen;
while (<ITES>) {
++$seen{(split)[0]};
}
while (<ITFR>) {
my $key = (split)[0];
print $key if $seen{$key};
}
__ITES__
uno = uno
due = dos
tre = tres
quattro = quatro
cinque = cinco
sei = seis
sette = siete
otto = ocho
nouve = nueve
dieci =diez
__ITFR__
uno = un
due = due
tre = tris
quattro = quatre
cinque = cinq
sei = six
sette = sept
dieci = dix
Output:
uno
due
tre
quattro
cinque
sei
sette
dieci
Now you have working code that does what you want.
One hash; two while loops; no foreach required.
Let's build on that to get the output you're after.
#!/usr/bin/env perl -l
use strict;
use warnings;
use Inline::Files;
my %seen;
while (<ITES>) {
chomp;
my ($key, $val) = split /\s*=\s*/;
$seen{$key} = $val;
}
while (<ITFR>) {
chomp;
my ($key, $val) = split /\s*=\s*/;
print "$key => $seen{$key}, $val" if $seen{$key};
}
__ITES__
uno = uno
due = dos
tre = tres
quattro = quatro
cinque = cinco
sei = seis
sette = siete
otto = ocho
nouve = nueve
dieci =diez
__ITFR__
uno = un
due = due
tre = tris
quattro = quatre
cinque = cinq
sei = six
sette = sept
dieci = dix
Output:
uno => uno, un
due => dos, due
tre => tres, tris
quattro => quatro, quatre
cinque => cinco, cinq
sei => seis, six
sette => siete, sept
dieci => diez, dix
As you can see, the basic structure of the code hasn't changed.
The first while loop is almost identical to yours (with the regex fixed).
The second while loop starts like yours.
But then just uses the same print ... if $seen{$key}; from my first example;
the only real difference is that, having captured more data, we now have more information to print.
To learn more about hashes in Perl, see "perldata - Perl data types" and "perldsc - Perl Data Structures Cookbook".
Lastly, you have spelling mistakes in your data. For instance, two and three in French are deux and trois. I'll leave you to check the rest.
|
Thanks for the advice on improving. I will follow it.
I like how you call the file in to be read with Inline::File module. However, I am using netbeans and I don't know how to load up a module from CPAN to this IDE. it seems to be asking for .nbm file while CPAN is providing a .PL extension. I don't know if these are friendly to each other. Any ideas?
|
"I like how you call the file in to be read ..."
No external files are involved.
There's just data embedded in the script.
"... with Inline::File module."
The module I used, and provided a link to, was Inline::Files (with an 's' at the end). This module is about 2 weeks old. I simply installed it using the cpan utility, which comes with the standard Perl distribution, like this from the command line:
$ cpan Inline::Files
You can find Inline::File (no 's' at the end) on CPAN. That name is probably just a typo: it provides the module Inline::Files. That module, however, is about 12 years old.
So, make sure you're accessing the Inline::Files module I originally indicated.
"... I am using netbeans and I don't know how to load up a module from CPAN to this IDE."
I've never used "netbeans". I can't help with this; perhaps another monk can.
"it seems to be asking for .nbm file while CPAN is providing a .PL extension."
You'd be better off posting a verbatim copy of what "it seems to be asking for" rather than this vague description. Here's the Inline::Files MANIFEST: that may, or may not, be useful.
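If installing a CPAN module is the obstacle, embedded test data can also be read with core Perl only. The following is a sketch of an alternative, not what the reply above used: it opens in-memory filehandles on strings, and the shortened sample data and the @matches array are assumptions made for illustration.

```perl
use strict;
use warnings;

# Stand-ins for the two data files (shortened sample data for this sketch).
my $ites = "uno = uno\ndue = dos\notto = ocho\ndieci =diez\n";
my $itfr = "uno = un\ndue = deux\ndieci = dix\n";

# Opening a reference to a string gives an ordinary read filehandle.
open my $in,  '<', \$ites or die "can't open in-memory data: $!\n";
open my $in1, '<', \$itfr or die "can't open in-memory data: $!\n";

my %seen;
while (<$in>) {
    chomp;
    my ($key, $val) = split /\s*=\s*/;
    $seen{$key} = $val;
}
close $in;

my @matches;
while (<$in1>) {
    chomp;
    my ($key, $val) = split /\s*=\s*/;
    push @matches, "$key => $seen{$key}, $val" if exists $seen{$key};
}
close $in1;

print "$_\n" for @matches;
```

The two while loops are the same as with real files; only the open calls differ, so the strings can later be swapped for file names without touching the rest.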
Re: Need advice on checking two hashes values and keys
by aaron_baugher (Curate) on Jun 03, 2015 at 22:18 UTC
Your first loop is fine; it reads lines from the first file and puts them in a hash as keys and values. Your second loop is kind of a mess. You have a few choices:
- On each line, find its key in the hash and go ahead and print the key (Italian), the value already in the hash (Spanish), and the value found in the current file (French).
- On each line, save the key and value into a new hash. Then after the second loop, have a third loop that goes through one of the hashes and prints out the keys and their values from each hash.
- Instead of saving the values in two hashes as simple scalars, save them in a single hash as a two-element array. So the hash would be structured like this:
%hash = ( uno => [ 'uno','un' ],
          due => [ 'dos','due' ],
          tre => [ 'tres','tris' ],
          # and so on
        );
This would mean changing your first loop so that it stores keys and values as $hash{$key}[0] and those from the second loop as $hash{$key}[1].
A problem with solutions #2 and #3 is that a hash is not ordered, so when you loop through the hash to print out the lines, they will not be in the order you want. To fix that, you would have to use an array of arrays instead of a hash, or keep a separate array of the keys to hold their order, or use a module that provides an ordered hash. If you use solution #1, you'll be printing them out in the same order you find them in the second file, which appears to be what you want.
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
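One of the ordering fixes mentioned above, keeping a separate array of the keys, could look roughly like this. It is a sketch under assumptions: the pairs are hard-coded sample data instead of being read from the files, and @order and @lines are names invented for the illustration.

```perl
use strict;
use warnings;

# Pairs as they might come from the two files (shortened sample data).
my @it_es = ( [ uno => 'uno' ], [ due => 'dos'  ], [ tre => 'tres'  ] );
my @it_fr = ( [ uno => 'un'  ], [ due => 'deux' ], [ tre => 'trois' ] );

my %hash;     # Italian => [ Spanish, French ]
my @order;    # Italian keys in the order they were first seen

for my $pair (@it_es) {
    my ($key, $value) = @$pair;
    push @order, $key unless exists $hash{$key};
    $hash{$key}[0] = $value;
}
for my $pair (@it_fr) {
    my ($key, $value) = @$pair;
    next unless exists $hash{$key};    # only keys present in both lists
    $hash{$key}[1] = $value;
}

# Walk @order, not keys %hash, so the output order is predictable.
my @lines = map { "$_ => " . join(', ', @{ $hash{$_} }) } @order;
print "$_\n" for @lines;
```

The hash still gives fast lookup by key; the array only remembers the sequence.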
|
Cool. Nice explanation and advice on how to attack this.
Your number 3 seems a little more advanced than where I am, but I think I want to use this method; I see a lot of potential here. Although the numerical ordering is nice, I think putting the data in this form will make me practice more difficult hashes; afterwards, I can practice ordering.
So... more questions will follow on this 3rd type of exercise. Thanks all for the help and advice.
|
%hash = ( uno => [ 1, 'uno' ],
          due => [ 2, 'dos' ],
          tre => [ 3, 'tres' ],
          # and so on
        );
Then after the second loop it would look like this:
%hash = ( uno => [ 1, 'uno','un' ],
          due => [ 2, 'dos','due' ],
          tre => [ 3, 'tres','tris' ],
          # and so on
        );
Then you'd need to learn how to sort the hash on the first element in each sub-array so that you can print them out in order. If you want to try that, then inside your first loop, you can get the line number to go with each key/value pair from the special $. variable.
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
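The sort on the first element of each sub-array might be sketched as follows. The assumptions here are the three-line sample string and the in-memory filehandle, used only so the example runs on its own; @sorted is an illustrative name.

```perl
use strict;
use warnings;

my $data = "uno = uno\ndue = dos\ntre = tres\n";    # sample data for the sketch
open my $in, '<', \$data or die "can't open in-memory data: $!\n";

my %hash;
while (<$in>) {
    chomp;
    my ($key, $value) = split /\s*=\s*/;
    $hash{$key} = [ $., $value ];    # $. is the current input line number
}
close $in;

# Compare element 0 (the line number) of each sub-array, numerically.
my @sorted = sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;

for my $key (@sorted) {
    my ($line_no, @words) = @{ $hash{$key} };
    print "$line_no: $key => @words\n";
}
```

Inside the sort block, $a and $b are hash keys, so each comparison looks up the stored line number rather than comparing the keys themselves.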
|
I appreciate all the advice and examples on manipulating hashes, hash tables, and code improvement, and will work through all of these with your help.
Should I create a new thread for each example or stick with this thread? Maybe some other newbie can learn from it.
OK, I've been playing with a hash ref to get a hash with 2-element values.
It hasn't worked yet. Could you explain the error I made with the hash ref?
This is the error it gave me:
Can't use string ("dos") as an ARRAY ref while "strict refs" in use at C:\Users\Alberto\Documents\NetBeansProjects\PerlProject\Perl Essentials\hash_ref_6_4.pl line 34, <$in1> line 1.
use strict;
use warnings;
use Data::Dump qw(dump);
my %hash;
#my file handles UNTIL I figure how to install the Inline::File module to netbeans IDE
open my $in, '<',"./test_data.txt" or die ("can't open the file:$!\n");
open my $in1,'<',"./test_data1.txt" or die ("can't open file : $!\n");
open my $out ,'>' ,"./test_data_out.txt" or die "can't open the file for write:$!\n";
open my $out1 ,'>',"./test_data_out1_no_match.txt" or die "can't open file for write:$!\n";
#creating hash
while (<$in>){
    chomp;
    my ($key,$value)= split(/\s*=\s*/); #allow for whitespace before or after the word
    $hash{$key}=$value;
}
close $in;
#using the first hash
while (<$in1>){
    chomp;
    my($key,$value)=split/\s*=\s*/ ;
    #push the value to existing hash as to get reference if key exists
    # %hash =( It => [Spa Fre])
    #using one hash as per Ken code suggestion??
    push @{$hash{$key}},$value if $hash{$key}; #I don't know how "push" works
    print $out dump (\%hash);
}
close $in1;
close $out;
close $out1;
|
(I'd suggest that you keep posting to this thread as long as you're working on the same problem, unless people stop responding to it.)
Ok, let's say you want to create a hash of two-element arrays, with the hash keys being the Italian numbers, each one pointing to a reference to a two-element array holding the Spanish and French numbers, in that order. Then as you're going through the first loop (Italian = Spanish), you need to insert the Spanish numbers as the first element of an array rather than a simple value:
$hash{$italian} = [ $spanish ];
The square brackets return a reference to an array, which is a scalar that can be stored as a value in the hash. So now it looks like this, with references to one-element arrays as the values:
%hash = ( uno => [ 'uno' ],
          due => [ 'dos' ],
          # and so on
        );
Then in the second loop, you need to add the French numbers to the arrays corresponding to their matching Italian hash keys. There are two ways you could do this:
# by assigning directly to the second element of the sub-array
$hash{$italian}[1] = $french;
# or by dereferencing the sub-array pointed to by the hash value
# and pushing the new value onto the end of that array
push @{$hash{$italian}}, $french;
# Either way, you'll end up with:
%hash = ( uno => [ 'uno','un' ],
          due => [ 'dos','deux' ],
          # and so on
        );
Then when you're ready to print them out, you loop through the keys of the hash, printing the key and the elements of the sub-array as you wish:
for my $key (keys %hash){
    print $key, ' => ', join ' , ', @{$hash{$key}}; # dereference sub-array
    print "\n";
}
The trick is keeping track of what level of the structure you're dealing with, and getting the sigils (and arrows, if necessary) right for pointing to the right things, whether values or references.
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
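The error message quoted in the question ("Can't use string ("dos") as an ARRAY ref") can be reproduced in a few lines. This is a small sketch, assuming a single hard-coded key instead of the file data; $ok and $err are names chosen for the illustration.

```perl
use strict;
use warnings;

my %hash;

# What the failing script stored: a plain string as the hash value.
$hash{due} = 'dos';
my $ok  = eval { push @{ $hash{due} }, 'deux'; 1 };
my $err = $@;
print "push onto a string fails: $err" unless $ok;

# Storing an array reference instead makes the same push legal.
$hash{due} = ['dos'];
push @{ $hash{due} }, 'deux';
print "due => ", join(' , ', @{ $hash{due} }), "\n";
```

The push itself is the same in both halves; what differs is whether the value it dereferences is a string or an array reference.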
Re: Need advice on checking two hashes values and keys
by FreeBeerReekingMonk (Deacon) on Jun 03, 2015 at 21:51 UTC
Uh... you iterate over the file while (<$in1>){ but inside that loop you iterate AGAIN? over foreach my $key (sort keys %hash){
That does not make much sense. A hash is made such that you do not need to iterate over them. You are treating the hash like it was an array.
while (<$in1>){
    chomp;
    my ($key,$value) = split (/\s*=\s*/);   # whitespace optional on both sides of '='
    if (exists $hash{$key}){
        print $out "$key => $hash{$key}, $value \n";
    }else{
        print $out1 "$key => $value \n";
    }
}
|
Grr, yes, the iteration was left over from a previous attempt to use "IF THEN" checks on the keys and values, before I abandoned it and decided to ask for help so I can learn hashes by coding practice. I should have commented out the foreach loop.
Your code did clear up hashes a bit for me, though. I misunderstood how "exists" worked: I wrongly inferred that if a key "exists" in both files, it would automatically keep the one key, regardless of how many identical keys with other values there were, and append the new values to it. I'm going to reread that section. Thanks.
I will soon play with "defined" as per the other code example... so much to learn about hashes.
I get this warning, but the code didn't crash. Why? What does it mean?
Use of uninitialized value $value in concatenation (.) or string at C:\Users\Alberto\Documents\NetBeansProjects\PerlProject\Perl Essentials\giocando_con_il'ordinamento_hashes.pl line 39, <$in1> line 9.
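One plausible reading of this warning, assuming the second data file has eight data lines followed by an empty ninth line: split returns nothing for an empty line, so $value is undef when it is printed. It is a warning rather than a fatal error, which is why the script keeps running. The sketch below uses a hypothetical parse_pairs helper to show one way of skipping such lines.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "key = value" lines, skipping anything that
# does not yield both parts (blank lines, lines without '=').
sub parse_pairs {
    my @lines = @_;    # copy, so chomp never touches the caller's data
    my @pairs;
    for my $line (@lines) {
        chomp $line;
        my ($key, $value) = split /\s*=\s*/, $line, 2;
        next unless defined $key && defined $value;
        push @pairs, [ $key, $value ];
    }
    return @pairs;
}

my @pairs = parse_pairs("sette = sept\n", "dieci = dix\n", "\n");
print "$_->[0] => $_->[1]\n" for @pairs;
```

Whether the ninth line really is empty can only be confirmed by looking at the data file itself.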
Re: Need advice on checking two hashes values and keys
by GotToBTru (Prior) on Jun 03, 2015 at 21:52 UTC
You reuse the variable $key and I think it might be confusing things for you. The if in your foreach loop will always be true because $key comes from keys in %hash.
while (<$in1>) {
    chomp;
    my ($key1,$value1) = split (/\s*=\s*/);
    if (defined($hash{$key1})) {
        print "$key1 is in both!\n"
    }
}
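The replies in this thread test hash membership in three ways: exists, defined, and plain truth. They only disagree for values that are undef or false, which the following sketch illustrates. The keys zero and nulla and the check helper are made up for the example and are not part of the original data.

```perl
use strict;
use warnings;

# 'zero' and 'nulla' are invented entries that make the three tests disagree.
my %hash = ( uno => 'uno', zero => 0, nulla => undef );

# Report how each test sees a key; "missing" is not in the hash at all.
sub check {
    my ($key) = @_;
    return sprintf "%-7s exists=%d defined=%d true=%d", $key,
        ( exists  $hash{$key} ) ? 1 : 0,
        ( defined $hash{$key} ) ? 1 : 0,
        (         $hash{$key} ) ? 1 : 0;
}

print check($_), "\n" for qw(uno zero nulla missing);
```

For data like the number words here, where every value is a non-empty string, all three tests give the same answer; exists is the one that asks only whether the key is present.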
Re: Need advice on checking two hashes values and keys
by Laurent_R (Canon) on Jun 04, 2015 at 15:05 UTC
TIMTOWTDI, There is more than one way to do it.
Given the nature of the data (and depending how it is supposed to be used later), I would probably use an array of hashes (AoH), something like this:
my @numbers = ( undef,
{ it => "uno", sp => "uno", fr => "un"},
{ it => "due", sp => "dos", fr => "deux"},
{ it => "tre", sp => "tres", fr => "trois"},
# ...
);
which yields a structure like this:
0  ARRAY(0x6004f9c80)
   0  undef
   1  HASH(0x600636430)
      'fr' => 'un'
      'it' => 'uno'
      'sp' => 'uno'
   2  HASH(0x6005d18a8)
      'fr' => 'deux'
      'it' => 'due'
      'sp' => 'dos'
   3  HASH(0x6005d1920)
      'fr' => 'trois'
      'it' => 'tre'
      'sp' => 'tres'
The advantage is that the array stays in order. Note that I created the first array element as undef, in order to have a natural correspondence between the element index and the numbers in the various languages (alternatively, I could have put a line for zero in all three languages). Each element of the array is a reference to a hash containing the number names in the various languages.
To access the Italian name of 2, simply try:
print $numbers[2]{it};
which should happily print "due".
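An array of hashes like this can also be built in a loop and walked in order. The sketch below assumes three parallel word lists (@it, @sp, @fr) as stand-ins for data that would normally be read from the files.

```perl
use strict;
use warnings;

# Parallel word lists; index 0 corresponds to the number 1 (sample data).
my @it = qw(uno due tre);
my @sp = qw(uno dos tres);
my @fr = qw(un deux trois);

# Slot 0 stays undef so that $numbers[$n] describes the number $n.
my @numbers = (undef);
for my $i (0 .. $#it) {
    push @numbers, { it => $it[$i], sp => $sp[$i], fr => $fr[$i] };
}

for my $n (1 .. $#numbers) {
    my $entry = $numbers[$n];    # a hash reference
    print "$n: $entry->{it} => $entry->{sp}, $entry->{fr}\n";
}
```

The loop starts at 1 to step over the undef placeholder in slot 0.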
|
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1129003
use warnings;
use strict;
$_ = <<END;
uno = uno
due = dos
tre = tres
quattro = quatro
cinque = cinco
sei = seis
sette = siete
otto = ocho
nouve = nueve
dieci =diez

uno = un
due = due
tre = tris
quattro = quatre
cinque = cinq
sei = six
sette = sept
dieci = dix
END
print "$1 => $2, $3\n" while /^(\w+) = *(\w+)\b(?=.*\n\n.*^\1 = (\w+))/gms;
|
OH MIO DIO! This is a cool way to do it.
I need to learn ALL KINDS of hashes, hashrefs, hash tables, ..., since I still get confused by them, but using a regex looks cool and I will try this method too.
Pretty cool.
Re: Need advice on checking two hashes values and keys
by Random_Walk (Prior) on Jun 04, 2015 at 14:40 UTC
Here is one without the hash, using a table. It does make me think a database would be better for this application :-) but perhaps we should look at the real application as being learning Perl...
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = (
    [qw( Italian Spanish French English Welsh  )],
    [qw( uno     uno     un     one     un     )],
    [qw( due     dos     deux   two     dau    )],
    [qw( tre     tres    trois  three   tri    )],
    [qw( quattro quatro  quatre four    pedwar )],
    [qw( cinque  cinco   cinq   five    pump   )],
    [qw( sei     seis    six    six     chwech )],
    [qw( sette   siete   sept   seven   saith  )],
    [qw( otto    ocho    huit   eight   wyth   )],
    [qw( nouve   nueve   neuf   nine    naw    )],
    [qw( dieci   diez    dix    ten     deg    )],
);
print "Please enter number to translate\n";
my $num = <>;
while (defined $num) {
    chomp $num;    # strip the newline before echoing or comparing
    print "You typed $num\n";
    for my $row (@numbers) {
        # dereference array, and look for our number in it; an exact match
        # (rather than /$num/) guarantees the index search below terminates
        next unless grep { $_ eq $num } @$row;
        print "I found it: ";
        print join " <-> ", @$row;
        print "\n";
        my $i = 0;
        $i ++ until $row->[$i] eq $num;
        print "It looks like it was in $numbers[0]->[$i]\n";
        last;
    }
    print "Please enter another number to translate\n";
    $num = <>;
}
The question this raises, is what happens when you type in uno? Altering it to know when a number has multiple matches, is left as an exercise for the reader.
Cheers, R.
Pereant, qui ante nos nostra dixerunt!
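One possible answer to the exercise above, reporting every language in which a word occurs, is sketched here. It is only an illustration: the table is shortened, and languages_for is a hypothetical helper that is not part of the original post.

```perl
use strict;
use warnings;

# Shortened version of the table; row 0 holds the language names.
my @numbers = (
    [qw( Italian Spanish French )],
    [qw( uno     uno     un     )],
    [qw( due     dos     deux   )],
    [qw( sei     seis    six    )],
);

# Return the names of every language whose word equals $num exactly.
sub languages_for {
    my ($num) = @_;
    my ($header, @rows) = @numbers;
    my %found;
    for my $row (@rows) {
        for my $i (0 .. $#$row) {
            $found{ $header->[$i] } = 1 if $row->[$i] eq $num;
        }
    }
    return grep { $found{$_} } @$header;    # keep the column order
}

print "uno: ", join(', ', languages_for('uno')), "\n";
print "six: ", join(', ', languages_for('six')), "\n";
```

Collecting the matches in a hash first and then filtering the header row keeps the result in column order and avoids listing a language twice.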
Re: Need advice on checking two hashes values and keys
by Random_Walk (Prior) on Jun 04, 2015 at 14:40 UTC
Here is one without the hash, using a table. It does make me think a database would be better for this application :-)
#!/usr/bin/perl
use strict;
use warnings;
my @numbers = (
    [qw( Italian Spanish French )],
    [qw( uno     uno     un     )],
    [qw( due     dos     deux   )],
    [qw( tre     tres    trois  )],
    [qw( quattro quatro  quatre )],
    [qw( cinque  cinco   cinq   )],
    [qw( sei     seis    six    )],
    [qw( sette   siete   sept   )],
    [qw( otto    ocho    huit   )],
    [qw( nouve   nueve   neuf   )],
    [qw( dieci   diez    dix    )],
);
print "Please enter number to translate\n";
my $num = <>;
while (defined $num) {
    chomp $num;    # strip the newline before echoing or comparing
    print "You typed $num\n";
    for my $row (@numbers) {
        # dereference array, and look for our number in it; an exact match
        # (rather than /$num/) guarantees the index search below terminates
        next unless grep { $_ eq $num } @$row;
        print "I found it: ";
        print join " <-> ", @$row;
        print "\n";
        my $i = 0;
        $i ++ until $row->[$i] eq $num;
        print "It looks like it was in $numbers[0]->[$i]\n";
        last;
    }
    print "Please enter another number to translate\n";
    $num = <>;
}
The question this raises, is what happens when you type in uno? Altering it to know when a number has multiple matches, is left as an exercise for the reader.
Cheers, R.
Pereant, qui ante nos nostra dixerunt!