Reminder to self: must use Memoize more often!

For functions which are non-volatile (i.e. the same inputs will always produce the same outputs) with no side-effects, it often makes sense to "memoize" them. That is, cache the results for the next time they're called.

This is an especially good idea in the case of recursive functions. The following benchmark script illustrates the massive speed up.

use strict;
use warnings;
use Memoize;
use Benchmark 'cmpthese';

sub fib1 {
    my $n = shift;
    return $n if $n < 2;
    return fib1($n-1) + fib1($n-2);
}

sub fib2 {
    my $n = shift;
    return $n if $n < 2;
    return fib2($n-1) + fib2($n-2);
}

memoize('fib2');

my %fib3_cache;
sub fib3 {
    my $n = shift;
    return $fib3_cache{$n} ||= do {
        $n < 2 ? $n : fib3($n-1) + fib3($n-2)
    };
}

for (0..5) {
    die "something bad" unless fib1($_)==fib2($_);
    die "something bad" unless fib1($_)==fib3($_);
}

cmpthese(-1, {
    fib1 => sub { fib1 20 },
    fib2 => sub { fib2 20 },
    fib3 => sub { fib3 20 },
});

__END__
         Rate     fib1     fib2     fib3
fib1   18.2/s       --    -100%    -100%
fib2  37594/s  206668%       --     -87%
fib3 289129/s 1590107%     669%       --
[download]

The Memoize module gives you a pretty decent speed up, without needing to change the guts of the function. Just add memoize('fib2') and the Memoize module will wrap the original function with memoization code.

The manually memoized version is clearly faster still, because it avoids the overhead of function wrapping - the memoization code is added to the guts of the function.

Of course, there's no such thing as a free lunch - you're trading CPU for memory.

perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

Comment on Reminder to self: must use Memoize more often! Select or Download Code

Replies are listed 'Best First'.
Re: Reminder to self: must use Memoize more often! by BrowserUk (Patriarch) on Jan 12, 2013 at 13:55 UTC
`return $fib3_cache{$n} \|\|= do {` I'd strongly recommend trading `//=` for `\|\|=`. Any function that can return 0, but costs to determine that, looses out as is. You can get greater benefits still for functions that take low integers as inputs by using an array instead of a hash for the cache: `my @fib4_cache; sub fib4 { my $n = shift; return $fib4_cache[$n] //= do { $n < 2 ? $n : fib4($n-1) + fib4($n-2) }; } __END__ C:\test>memofib.pl Rate fib1 fib2 fib3 fib4 fib1 88.7/s -- -100% -100% -100% fib2 219835/s 247838% -- -89% -91% fib3 2073455/s 2338415% 843% -- -19% fib4 2551114/s 2877136% 1060% 23% --` [download] And see the benefits more clearly if you do away with some of the benchmark overheads by loosing the sub wrappers (The module adds its own wrapper internally): `cmpthese(-1, { fib1 => q{ fib1 20 }, fib2 => q{ fib2 20 }, fib3 => q{ fib3 20 }, fib4 => q{ fib4 20 }, }); __END__ C:\test>memofib.pl Rate fib1 fib2 fib3 fib4 fib1 87.4/s -- -100% -100% -100% fib2 223254/s 255377% -- -89% -91% fib3 2071504/s 2370384% 828% -- -20% fib4 2579867/s 2952119% 1056% 25% --` [download] And finally, there's a couple more % to be had: `my @fib5_cache; sub fib5 { my $n = shift; return $fib5_cache[ $n ] //= do { $n < 2 ? $n : ( $fib5_cache[ $n-1 ] //= fib5($n-1) ) + ( $fib5_cache[ $n-2 ] //= fib5($n-2) ) ## corrected; thanks +to moriitz }; } __END__ C:\test>memofib.pl Rate fib1 fib2 fib3 fib4 fib5 fib1 89.0/s -- -100% -100% -100% -100% fib2 218452/s 245352% -- -89% -91% -91% fib3 2022174/s 2272005% 826% -- -18% -20% fib4 2478824/s 2785096% 1035% 23% -- -2% fib5 2537045/s 2850512% 1061% 25% 2% --` [download] With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]
Re^2: Reminder to self: must use Memoize more often! by tobyink (Canon) on Jan 12, 2013 at 14:40 UTC
On my machine at least, `fib6` is 4% faster than `fib5`: `my @fib6_cache; sub fib6 { my $n = shift; $fib6_cache[ $n ] //= $n < 2 ? $n : ( $fib6_cache[ $n-1 ] //= fib6($n-1) ) + ( $fib6_cache[ $n-2 ] //= fib6($n-2) ); }` [download] `perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'`	[reply] [d/l] [select]
Re^3: Reminder to self: must use Memoize more often! by BrowserUk (Patriarch) on Jan 12, 2013 at 15:03 UTC
Fair enough. I guess we're into Benchmark vagaries. (T'was a bit unnecessary given the 1000% already gained :) With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply]
Re: Reminder to self: must use Memoize more often! by flexvault (Monsignor) on Jan 12, 2013 at 17:34 UTC
Hello tobyink, I use 'Memoize' and it is an awesome product. But before I knew about it, I did some memorization using a hash. More than 10 years ago I wrote the following code for an editor that I converted from 'traditional C' to Perl: `sub ATS { # PyrEdit assumes the HOME is col 1, row 1 our %CRToutput; my $col = int(shift); my $row = int(shift); my $Saved = "$col\|$row"; if (exists $CRToutput{$Saved}) { return($CRToutput{$Saved}); } my $loc; if ( $SYS{"ATS"} eq "xterm" ) ## all 'xterm' compatibles { $loc = "\e\[$row\;$col"."H"; } else { ## PyrEdit assumes that HOME is col 1, row 1 if ( $SYS{"ATSDEC"} eq "Y" ) { $row--; $col--; } $loc = qx/tput cup $row $col/; chomp($loc); } $CRToutput{$Saved} = $loc; return($loc); }` [download] In nix the 'tput' command is very expensive, and since I had to support lots of different terminal types, I cached the results using a hash. The results were as good as the 'traditional C' after just a few seconds of use. Just an alternative. Update:* The original sub didn't have the 'my's. It was only after joining PM and learning of the value of 'use strict; use warnings' that I went back and put the 'my's into the code. I'm sure I've updated it in more than one place. It's currently on my mind since I received a package of xml files on Thursday without any "\n" and I added an '-xml' option the next day. After 2-3 hours of coding, I have a XML formatter that show the proper indention and structure for all(mine) XML files. "Well done is better than well said." - Benjamin Franklin	[reply] [d/l]
Re: Reminder to self: must use Memoize more often! by davido (Cardinal) on Jan 13, 2013 at 06:12 UTC
:) See also Re: Get me excited about perl. I fully agree. Memoization is a wonderful technique. The trick is recognizing when it's useful. Dave	[reply]

Back to Cool Uses for Perl