http://qs321.pair.com?node_id=587072

All,
A typical dispatch table is a technique used to translate:
if ($item eq 'thing1') { # ... } elsif ($item eq 'thing2') { # ... } elsif ($item eq 'thing3') { # ... } elsif ($item eq 'thing4') { # ... } else { # ... }
Into: exists $dispatch{$item} ? $dispatch{$item}->() : $dispatch{default}->();

In addition to opportunities for code reduction and clean maintainable code, dispatch tables can be a runtime optimization. The if/eslif chain is a linear scan O(N) while a hash is purported to be O(1). The hash does have the overhead of looking up the key, dereferencing the value, and executing the code ref. I assumed that in what only take a few equality checks to pay for this overhead.

In this node, I claimed that my version of the code was much more efficient. That node inspired madbombX to convert an if/eslif chain into a dispatch table and benchmark it. Unfortunately, the bench was flawed and the results skewed. This lead me to write code to generate a fair benchmark based off user input.

#!/usr/bin/perl use strict; use warnings; use Getopt::Std; use HTML::Template; use List::Util; my %opt; get_args(\%opt); my $template = HTML::Template->new(filename => 'bench.tmpl'); $template->param(CHAIN_LENGTH => $opt{n}); $template->param(UNMATCHED_ITEMS => $opt{u}); my @chain = map {ITEM => $_}, 2 .. $opt{n}; $template->param(CHAIN => \@chain); my $unmatched_items = int(($opt{u} / 100) * $opt{t}); $unmatched_items ||= 1 if $opt{u}; my @input = (-1) x $unmatched_items; my $total = $opt{t} - $unmatched_items; my @weight = $opt{w} ? sort {$b <=> $a} grep $_, split /[%,]|\s+/, $opt{w} : (int(100/$opt{n})) x $opt{n}; for (1 .. $opt{n}) { my $amount = shift @weight; push @input, ($_) x (int(($amount/100) * $total) || 1); } $_ = {FIND => $_} for @input; @input = List::Util::shuffle(@input) if ! $opt{o}; $template->param(BUILD_FIND => \@input); print $template->output(); sub get_args { my $opt = shift @_; my $Usage = qq{Usage: $0 [options] -h : This help message -n : The (n)umber of items in the if/elsif chain Default: 5 -t : The (t)total number of items to process This number need not be the same as your actual data. It should be sufficient to exercise the options Default: -n * 10 -u : The percentage of (u)nmatched items to include Typical if/elsif chains end with a catch-all else This option allows you to determine how often this is use +d Default: 0 -o : Specifies if the if/elsif chain should be (o)rdered If the input data is well known, it is possible to order the if/elsif chains to ensure the most common items are f +irst Default: 0 (off) -w : The (w)eight in percentages of the matched input Ignored unless -o option also specified There must be as many #% as there are -n to work correctl +y The #% must add up to 100 to work correctly The -b option, if present, takes precedence before this o +ption For instance, if you have n = 5, t = 100, -o, and u = 10 -w"40%, 30%, 15%, 10%, 5%" would equate to 100% of the re +maining 90% Default: N/A } . "\n"; getopts('hn:t:u:w:o', $opt) or die $Usage; die $Usage if $opt->{h}; delete $opt->{w} if ! exists $opt->{o}; $opt->{n} = 5 if ! defined $opt->{n}; $opt->{t} = $opt->{n} * 10 if ! defined $opt->{t}; $opt->{u} = 0 if ! defined $opt->{u}; }
bench.tmpl
#!/usr/bin/perl use strict; use warnings; use Benchmark 'cmpthese'; my %dispatch = map {$_ => sub {my $res = "I found it"}} 1 .. <TMPL_VAR + NAME="CHAIN_LENGTH">; <TMPL_IF NAME="UNMATCHED_ITEMS"> $dispatch{default} = sub {my $res = "I did not find it"}; </TMPL_IF> my @find; <TMPL_LOOP NAME=BUILD_FIND> push @find, <TMPL_VAR NAME=FIND>; </TMPL_LOOP> cmpthese(-10, { dispatch => sub { for (@find) { <TMPL_IF NAME="UNMATCHED_ITEMS"> if (exists $dispatch{$_}) { $dispatch{$_}->(); } else { $dispatch{default}->(); } <TMPL_ELSE> $dispatch{$_}->(); </TMPL_IF> } }, if_else => sub { for (@find) { if ($_ eq '1') { my $res = "I found it"; } <TMPL_LOOP NAME=CHAIN> elsif($_ eq '<TMPL_VAR NAME=ITEM>') { my $res = "I found it"; } </TMPL_LOOP> <TMPL_IF NAME="UNMATCHED_ITEMS"> else { my $res = "I did not find it"; } </TMPL_IF> } }, });

The results were quite amazing. No matter how I tweaked the settings, with only 6 items in the if/elsif chain I could not make the dispatch table win. It is still possible that the other optimizations I made to the code in that node made it more efficient but without a representative sample of the real input - I can't be sure.

The breakpoint without unmatched items is about 12 on my system. There are plenty of user defineable variables so perhaps you might want to question code you have that uses dispatch tables play around. While the code reduction and maintainability is still an advantage, the performance boost may not be all it is cracked up to be.

Update: Minor change to wording in last paragraph after realizing it may imply I was recommending not using dispatch tables because they were not as efficient as I assumed them to be originally.

See Also:

Cheers - L~R