http://qs321.pair.com?node_id=1081973

Random_Walk has asked for the wisdom of the Perl Monks concerning the following question:

Which construct is preferred, and why?

# I am looking for hosts that appear in all of a set of SQL tables.
# I keep a table count and increment the count for a host for each table in
# which it occurs.
# Then I clean out those that were not in all tables.

# Faked data
my %hosts = (
    a => 3,
    b => 3,
    c => 4,
    d => 3,
    e => 2,
    f => 3,
    g => 3,
);
my $table_count = 3;

# This:
do {delete $hosts{$_} unless $hosts{$_} == $table_count} for keys %hosts;

# or This:
map {delete $hosts{$_} unless $hosts{$_} == $table_count} keys %hosts;
Or is there a better way to do it?

Cheers,
R.

Pereant, qui ante nos nostra dixerunt!

Re: map {} list or do {} for list?
by LanX (Saint) on Apr 11, 2014 at 14:29 UTC
    TIMTOWTDI, both are fine.

    IIRC old versions of Perl had limitations with map in void context, but that's history.

    Some might argue that this is more readable

    for my $key ( keys %hosts ) {
        if ( $hosts{$key} != $table_count ) {
            delete $hosts{$key};
        }
    }

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    update

    inverted unless to if

    update
    this works
    while ( my ($key, $value) = each %hosts ) {
        delete $hosts{$key} if $value != $table_count;
    }

    but I'm not sure about side effects!

    edit

    aha -> each

    If you add or delete elements of a hash while you’re iterating over it, you may get entries skipped or duplicated, so don’t. Exception: It is always safe to delete the item most recently returned by "each()", which means that the following code will work:

    while (($key, $value) = each %hash) {
        print $key, "\n";
        delete $hash{$key};   # This is safe
    }
        No I didn't¹ ..

        But the problem ...

        a) seems to be introduced by the new hash randomization, so it's a bug

        and

        b) is not new: you can't nest each %hash because it has global side effects². In my case it's only local to the loop.
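
        A minimal sketch of that shared-iterator problem, using a throwaway hash of my own rather than the OP's data: both loops call each on the same hash, so they fight over its single internal iterator.

        # Illustration only: %h is a made-up example hash.
        my %h = ( a => 1, b => 2, c => 3 );
        while ( my ($outer) = each %h ) {
            # The inner loop advances the SAME iterator the outer loop uses,
            # runs it to exhaustion, and thereby resets it ...
            while ( my ($inner) = each %h ) {
                print "$outer / $inner\n";
            }
            # ... so the next outer each() starts from the first key again
            # and the outer loop never terminates.
        }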

        But yeah, I would love to have something like hashgrep and hashmap in core ...
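
        In the meantime something along those lines is easy enough to roll by hand; this hashgrep is only a sketch of my own, not a core or CPAN function:

        # Hypothetical helper: keep only the pairs whose ($key, $value) satisfy the block.
        sub hashgrep (&\%) {
            my ( $code, $hash ) = @_;
            return map  { ( $_ => $hash->{$_} ) }
                   grep { $code->( $_, $hash->{$_} ) }
                   keys %$hash;
        }

        # Usage with the OP's data:
        %hosts = hashgrep { $_[1] == $table_count } %hosts;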

        Cheers Rolf

        ( addicted to the Perl Programming Language)

        ¹) maybe I should shift some of my time from PM to blogs.perl.org to follow Reini, Aristoteles and Damian...

        ²) maybe better phrased as "bound to the hash ref". I can imagine situations where passing around the hashref and iterating over it in different places is useful.

      Hi Rolf,

      just wanted to post this snippet:

      for my $host (keys %hosts) {
          next if $hosts{$host} == $table_count;
          delete $hosts{$host};
      }

      but checked the incoming answers first to avoid posting something redundant. In this case I had to smile because I do agree with your "taste" (++).

      UPDATE: Had a logic error in there: changed < to ==.

      McA

Re: map {} list or do {} for list?
by hdb (Monsignor) on Apr 11, 2014 at 14:50 UTC

    Or with a slice:

    delete @hosts{ grep { $hosts{$_} != $table_count } keys %hosts };
Re: map {} list or do {} for list?
by SuicideJunkie (Vicar) on Apr 11, 2014 at 14:30 UTC

    map returns a list. It should be used when you want the list being generated as output.

    do{...} for should be used when you just want to do things in a loop.

    Since your statement is in void context, you obviously don't want the resulting list. Thus, the for loop is what you want.
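
    To make the distinction concrete, here is a small example of my own (not the OP's code):

    my @words = qw( alpha be gamma );

    # map: the point is the list it returns
    my @lengths = map { length } @words;      # (5, 2, 5)

    # for: the point is the side effect; no list is wanted
    my %seen;
    $seen{$_}++ for @words;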

      That was my feeling about map too. But I have seen it used in void context in a few bits of code recently, and wondered if there was any solid argument for or against, other than that it looks wrong :)

      Cheers,
      R.

      Pereant, qui ante nos nostra dixerunt!

        Well, it certainly still works, but it's misleading. You're building a big list of results and then immediately throwing it away.

Re: map {} list or do {} for list?
by choroba (Cardinal) on Apr 11, 2014 at 14:35 UTC
    Just to add to TIMTOWTDI:
    $hosts{$_} == $table_count or delete $hosts{$_} for keys %hosts;

    Update: Fixed; I had used and instead of or.

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      This deletes the entries the OP wanted to keep...

        Thanks, fixed.
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: map {} list or do {} for list?
by tobyink (Canon) on Apr 11, 2014 at 21:11 UTC

    If I wanted it to be fast and concise, I'd probably go with:

    $hosts{$_}==$table_count or delete($hosts{$_}) for keys %hosts;
    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name
Re: map {} list or do {} for list?
by jethro (Monsignor) on Apr 11, 2014 at 14:30 UTC

    You know the perl motto? "There is more than one way to do it"

Re: map {} list or do {} for list?
by Laurent_R (Canon) on Apr 11, 2014 at 17:41 UTC
    Or copying the hash onto itself, so to speak:
    %hosts = map { $hosts{$_} == $table_count ? ($_, $hosts{$_}) : () } keys %hosts;
Re: map {} list or do {} for list? - Benchmarks
by Random_Walk (Prior) on Apr 12, 2014 at 12:32 UTC

    Thanks to all who contributed to this thread. I have put together a quick benchmark. Unless I made an error in my code, map is the clear winner on performance grounds, as well as on the grounds of confusing any newbie who has to look at my code ;-)

                     Rate or delete        do      copy     slice       map
    or delete   1620746/s        --       -1%      -50%      -71%      -78%
    do          1642036/s        1%        --      -49%      -70%      -78%
    copy        3215434/s       98%       96%        --      -42%      -56%
    slice       5555556/s      243%      238%       73%        --      -24%
    map         7299270/s      350%      345%      127%       31%        --
    Here is the code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    # create a hash with a few values deviating
    my $t_count = 3;
    my %hosts = map { $_ => $t_count - ( rand 50 > 49 ? 1 : 0 ) } (1 .. 1000);

    my $count = 10_000_000;

    cmpthese($count, {
        '       do' => ' do {delete $hosts{$_} unless $hosts{$_} == $t_count} for keys %hosts ',
        '      map' => ' map {delete $hosts{$_} unless $hosts{$_} == $t_count} keys %hosts ',
        'or delete' => ' $hosts{$_} == $t_count or delete $hosts{$_} for keys %hosts ',
        '    slice' => ' delete @hosts{ grep { $hosts{$_} != $t_count } keys %hosts } ',
        '     copy' => ' %hosts = map { $hosts{$_} == $t_count ? ($_, $hosts{$_}) : ()} keys %hosts ',
    });

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!

      When you see operations per second that high, you should ask yourself whether any work is being done. In fact, none really is, and your benchmark is rendered unreliable as a result. First, you have scoping issues. And even if you fix those, you are still left with the issue that choroba identified: the first benchmark iteration deletes from the master copy of the hash, leaving the remaining iterations with less work to do.

      Here's a version that codes around the scoping issues that evaled code creates, and that makes a copy of %hosts on each iteration. That copy costs time, but it costs the same amount of time for each snippet.

      use Benchmark qw(cmpthese);

      our $t_count = 3;
      our %hosts = map { $_ => $t_count - ( rand 50 > 49 ? 1 : 0 ) } (1 .. 1000);

      my $count = 10000;

      cmpthese($count, {
          do    => 'my %t = %main::hosts; do { delete $t{$_} unless $t{$_} == $main::t_count } for keys %t;',
          map   => 'my %t = %main::hosts; map { delete $t{$_} unless $t{$_} == $main::t_count } keys %t;',
          or    => 'my %t = %main::hosts; $t{$_} == $main::t_count or delete $t{$_} for keys %t;',
          slice => 'my %t = %main::hosts; delete @t{ grep { $t{$_} != $main::t_count } keys %t };',
          copy  => 'my %t = %main::hosts; %t = map { $t{$_} == $main::t_count ? ($_, $t{$_}) : ()} keys %t;',
          nada  => 'my %t = %main::hosts;',
      });

      And here's the output I get:

               Rate  copy   map    or    do slice  nada
      copy   1372/s    --  -57%  -57%  -57%  -58%  -78%
      map    3175/s  131%    --   -2%   -2%   -2%  -50%
      or     3226/s  135%    2%    --    0%   -0%  -49%
      do     3226/s  135%    2%    0%    --   -0%  -49%
      slice  3236/s  136%    2%    0%    0%    --  -49%
      nada   6289/s  358%   98%   95%   95%   94%    --

      "nada" is just there to identify how much time we're wasting making a fresh copy of the hash on each iteration.

      As you can see, all of the approaches except for the copy one are so close that they're probably within the margin of error. Use the one that seems most legible, and if there's a risk that it won't be comprehended, encapsulate by wrapping it in a well-named subroutine.
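
      Another way to sidestep the scoping issue, sketched here without being benchmarked, is to hand cmpthese code references instead of strings; the closures see ordinary lexicals, so the package-variable workaround isn't needed at all:

      use strict;
      use warnings;
      use Benchmark qw(cmpthese);

      my $t_count = 3;
      my %hosts = map { $_ => $t_count - ( rand 50 > 49 ? 1 : 0 ) } (1 .. 1000);

      cmpthese(10000, {
          slice     => sub {
              my %t = %hosts;   # the closure sees the lexical %hosts directly
              delete @t{ grep { $t{$_} != $t_count } keys %t };
          },
          or_delete => sub {
              my %t = %hosts;
              $t{$_} == $t_count or delete $t{$_} for keys %t;
          },
          nada      => sub { my %t = %hosts },
      });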


      Dave

      After running the first benchmarked subroutine, your %hosts hash gets smaller. The deleting never happens again.
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ