Using foreach to process a hash

iridius has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Using foreach to process a hash by imp (Priest) on Oct 22, 2006 at 04:10 UTC
You could use foreach to iterate over the keys or values, or you could use a while loop and each. `my %hash = ( a => 1, b => 2); while (my ($key,$val) = each %hash) { print "$key = $val\n"; } for my $key (keys %hash) { my $val = $hash{$key}; print "$key = $val\n"; }` [download]	[reply] [d/l]
Re^2: Using foreach to process a hash by ikegami (Patriarch) on Oct 22, 2006 at 21:05 UTC
Note: You can't safely use `last`, `return` or `die` to exist a `while each` loop. Not running `each` to the end makes `each` unusuable on that hash at a latter point. While less memory efficient, the `for` alternative avoids this problem.	[reply] [d/l] [select]
Re: Using foreach to process a hash by jdporter (Paladin) on Oct 22, 2006 at 04:20 UTC
`foreach ($value, $key = each %texthash) {` [download] Where in the documentation did you see an example like that? You seem to have some vague idea of how each works, but your syntax is so far off, I'd have to suppose that you haven't read the man page for each, because if you had, you'd have seen the following example: `while (($key, $value) = each %hash) {` [download] You got three things wrong: use while, not foreach the pair returned by each is `$key,$value` not `$value,$key` you need parentheses around `($key,$value)` We're building the house of the future together.	[reply] [d/l] [select]
Re^2: Using foreach to process a hash by blazar (Canon) on Oct 22, 2006 at 12:42 UTC
`while (($key, $value) = each %hash) {` [download] You got three things wrong: use while, not foreach the pair returned by each is `$key,$value` not `$value,$key` you need parentheses around `($key,$value)` But of course, if he wants to use foreach, then for completeness it's fair to remind that he can do so, but in connection with keys, not each: `foreach my $key (keys %hash) { my $value=$hash{$key};` [download] Update: hadn't noticed it already been covered by imp's reply. Hopefully repetita iuvant	[reply] [d/l] [select]
Re^3: Using foreach to process a hash by jdporter (Paladin) on Oct 22, 2006 at 13:27 UTC
Yes, but that had already been covered by imp's reply.	[reply]
Re: Using foreach to process a hash by DigitalKitty (Parson) on Oct 22, 2006 at 06:03 UTC
Hi all. Welcome to the monastery iridius. I took the liberty of including two sources of information you might find beneficial: DevShed Perl articles Beginning Perl The latter is a free online text (written for v5.6.1) that was designed to educate the beginning perl programmer. Feel free to download any chapter you'd like (or all of them). :) Concerning your question, jdporter answered it but I felt obligated to contribute. If you're processing a hash of considerable size, obviously you'd prefer to use the most efficient functions possible. In order to accomplish this end, the standard perl library includes a module called Benchmark (please see below for an example): `#!/usr/bin/perl use warnings; use strict; use Benchmark; timethese(1_000_000, { foreach_loop => \&foreach_loop, while_each_loop => \&while_each_loop } ); my %monks = ( jdporter => 'Prior', tye => 'Bishop', bobf => 'Vicar', planetscape => 'Vicar', belg4mit => 'Parson', ); sub foreach_loop { foreach my $key ( keys %monks ) { print "$key => $monks{$key}", "\n"; } } sub while_each_loop { while( my( $key, $value ) = each %monks ) { print "$key => $value", "\n"; } }` [download] Output: Benchmark: timing 1000000 iterations of foreach_loop, while_each_loop... foreach_loop: 0 wallclock secs ( 1.16 usr + 0.00 sys = 1.16 CPU) @ 864304.24/s (n=1000000) while_each_loop: 1 wallclock secs ( 0.51 usr + 0.00 sys = 0.51 CPU) @ 1941747.57/s (n=1000000) In order to ascertain which function is more efficient, look at the output and you'll see: sys =n.nn CPU. The Benchmark module functions by executing your code as many times as the first parameter after the timethese() subroutine indicates then calculates an average based upon the amount of time it took. It then reports on the total amount of time taken. As one can see, the while_each loop executes faster and that would be a wise choice when iterating over an exceptionally large hash. Update: In the event you see '(warning: too few iterations for a reliable count)' in the output, simply augment the number of code executions in the timethese() function. 2nd Update: Thanks jdporter and blazar. I rushed through this example and hadn't noticed the small bugs I had introduced by doing so. Hope this helps, ~Katie	[reply] [d/l]
Re^2: Using foreach to process a hash by jdporter (Paladin) on Oct 22, 2006 at 14:06 UTC
Unfortunately, DigitalKitty's benchmark suffers from a couple of major flaws. Primarily, the subs under test are doing prints. That means that timing is going to be swamped by I/O. The second thing is that the hash is so small, that in the ops performed by each sub, the ones we're trying to test (`each` and `keys`) occur very few times, relative to the overhead of calling the sub, etc. So I offer the following benchmark, which eliminates both of those sources of error. use Benchmark; my @words = do { local @ARGV = ( 'mondo_word_list.txt' ); <> }; my %w; @w{@words} = @words; @words = keys %w; print scalar(keys %w), " words\n"; timethese( 10, { foreach_loop => \&foreach_loop, foreach_loop_novar => \&foreach_loop_novar, while_each_loop => \&while_each_loop, array => \&array, }); sub foreach_loop { for my $key ( keys %w ) { $a = $key; $b = $w{$key}; } } sub foreach_loop_novar { for ( keys %w ) { $a = $_; $b = $w{$_}; } } sub while_each_loop { while( my( $key, $val ) = each %w ) { $a = $key; $b = $val; } } sub array { for ( @words ) { $a = $_; $b = $_; } } [download] Output: (slightly edited) `311142 words foreach_loop: 7 wallclock secs ( 6.22 usr foreach_loop_novar: 6 wallclock secs ( 6.17 usr while_each_loop: 5 wallclock secs ( 5.13 usr array: 1 wallclock secs ( 1.08 usr` [download] As you can see, for large hashes, `while each` wins over `for keys`. And you also gain a little by using the default iterator on the `for` loop. We're building the house of the future together.	[reply] [d/l] [select]
Re^2: Using foreach to process a hash by blazar (Canon) on Oct 22, 2006 at 13:30 UTC
Concerning your question, jdporter answered it but I felt obligated to contribute. If you're processing a hash of considerable size, obviously you'd prefer to use the most efficient functions possible. In order to accomplish this end, the standard perl library includes a module called Benchmark (please see below for an example): I beg to differ. I've done similar interventions before and I know the subject is controversial, but I don't mind being downvoted. Don't misunderstand me: Benchmark.pm is great and I use it quite often, i.e. whenever I really need it. Indeed had the OP called for 'efficiency', it may have been a prefectly sensible answer. But whenever one calls for 'efficiency' one bell should ring, and often does: the very question is whether efficiency would be relevant at all in the situation under consideration. Sometimes it is, sometimes it's not. Actually in the latter case it often turns out to be yet another case of obsession for premature optimization which, we all know, is the root all evil in programming. Now the point is, I see a risk in pointing a newbie like iridius towards these issues: precisely the risk of generating or contributing to that obsession for premature optimization... OTOH your code has a bug: C:\temp>dk Benchmark: timing 1000000 iterations of foreach_loop, while_each_loop. +.. foreach_loop: 0 wallclock secs ( 1.08 usr + 0.00 sys = 1.08 CPU) @ +927643.78/ s (n=1000000) while_each_loop: 0 wallclock secs ( 0.50 usr + 0.00 sys = 0.50 CPU) + @ 2000000 .00/s (n=1000000) C:\temp>perl -w dk.pl Name "main::hash" used only once: possible typo at dk.pl line 17. Benchmark: timing 1000000 iterations of foreach_loop, while_each_loop. +.. foreach_loop: 2 wallclock secs ( 1.08 usr + 0.00 sys = 1.08 CPU) @ +927643.78/ s (n=1000000) while_each_loop: 1 wallclock secs ( 0.52 usr + 0.00 sys = 0.52 CPU) + @ 1941747 .57/s (n=1000000) C:\temp>perl -wMstrict dk.pl Global symbol "%hash" requires explicit package name at dk.pl line 17. Execution of dk.pl aborted due to compilation errors. [download] Indeed you forgot to change `$hash{$key}` to `$monks{$key}` in code lifted from previous example, I suppose. In this case it doesn't alter excessively the results, but in others, a similar error may do, and in relevant manners. So I felt obligated to contribute too, but to the effect of reminding the OP that as the above clearly shows, just inserting `use strict; use warnings;` [download] at the top of quite about all of his scripts is probably the best way to avoid many common programming mistakes. It just tells perl to give one all the help it can to avoid them. One last piece of advice I can give him, also inspired by this example, is to use descriptive names for his variables: e.g. `$monk` instead of `$key` and `$rank` instead of `$value`. The rationale being that if it doesn't make sense when mentally translated into English, then chances are it may be wrong...	[reply] [d/l] [select]
Re: Using foreach to process a hash by planetscape (Chancellor) on Oct 22, 2006 at 13:01 UTC
You might also wish to have a look at Not Exactly a Hash Tutorial, or at any other of our very fine Tutorials. HTH, planetscape	[reply]
Re: Using foreach to process a hash by Melly (Chaplain) on Oct 22, 2006 at 09:52 UTC
"keys" will return the keys for a hash, and "values" will return all the values. For what you want to do, "keys" is the most useful - e.g. `use strict; my %hash = ('a'=>'hello', 'b'=>'goodbye', 'c'=>'foobar'); foreach my $key(keys %hash){ # iterates over 'a', 'b' and 'c' print "the value of $key is $hash{$key}\n"; }` [download] One point to note is that the order of the returned keys is not predictable (I get c,a,b in the above example). Tom Melly, tom@tomandlu.co.uk	[reply] [d/l]
Re: Using foreach to process a hash by lyklev (Pilgrim) on Oct 22, 2006 at 14:14 UTC
Are you coming from php? In Perl, variables are declared using `my`, and if you use the `use strict` pragma, all variables must be declared. Simply storing a variable in them does not declare them, like in php.	[reply] [d/l] [select]


Don't ask to ask, just ask
	PerlMonks

Using foreach to process a hash

Output: