Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re: Finding all substrings

by samtregar (Abbot)
on Apr 24, 2002 at 18:17 UTC ( [id://161710]=note: print w/replies, xml ) Need Help??


in reply to Finding all substrings

This solution depends on two factors, how many times does the code run and how much do the length of the strings vary? If the answers are "a lot" and "not much" then here's a faster solution:
{ my @cache; # a cache of substring-finding subs sub substrings { my $string = shift; my $length = length $string; # use cached sub if we have one return $cache[$length]->($string) if exists $cache[$length]; # create sub to find substrings for this length my $sub = 'sub { $_ = shift; return ('; foreach my $length (1..length($string)) { foreach my $offset (0..length($string)-$length) { $sub .= "substr(\$_,$offset,$length),"; } } $sub .= ")};"; $cache[$length] = eval $sub; # and use it return $cache[$length]->($string); } }
Using Benchmark.pm and a test case against random words from /usr/dict/words I found this to be 300% faster over 100,000 iterations. Not bad, but I bet we could do better!

-sam

PS: Has anyone noticed that <code> doesn't deal with hard tabs right? Copy-and-paste from Emacs is unpleasant.

Replies are listed 'Best First'.
Re: Re: Finding all substrings
by samtregar (Abbot) on Apr 24, 2002 at 19:13 UTC
    Slight improvement, now around 400% faster, at the expense of readability. I got rid of the parameter passing to the cached subs by using $_. Also, I unrolled the last iteration replacing a substr($string,0,length($string)) with just $string. I tried unrolling the first iteration into a split() but that was slower.
    { my @cache; sub substrings { $_ = shift; return &{$cache[length($_)]} if exists $cache[length($_)]; my $sub = 'sub { return ('; foreach my $len (1..length($_)-1) { foreach my $off (0..length($_)-$len) { $sub .= "substr(\$_,$off,$len),"; } } $sub .= "\$_)};"; $cache[length($_)] = eval $sub; return &{$cache[length($_)]}; } }
    The only trick I have left is Inline::C... I'll leave that as an exercise for the reader.

    -sam

Re: Finding all substrings
by Dominus (Parson) on Apr 24, 2002 at 19:34 UTC
    Says samtregar:
    I found this to be 300% faster over 100,000 iterations. Not bad, but I bet we could do better!
    Supposing that the original version took 10 seconds to run, then yours, 300% faster, completes and delivers the answer 20 seconds before you even enter the command that runs it. That is an impressive achievement indeed. I admire your optimization skills.

    Perhaps you meant that yours was three times as fast? If so, it was 66.67% faster, not 300%.

    Hope this helps.

    --
    Mark Dominus
    Perl Paraphernalia

      Um, just reading the Benchmark output man:
      Rate original new original 7710/s -- -80% new 38610/s 401% --
      If that doesn't mean 400% faster then I guess I need more education.

      -sam

        Benchmark is telling you:
        • new is 80% faster than original
        • original is 401% slower than new
        Don't confuse "this much more" with "this much less." :-)

        Russ
        Brainbench 'Most Valuable Professional' for Perl

      If it took 10 seconds, and was 300% faster, that would merely mean it was 3 seconds long. I think. Actually i think theyre both interchangeable

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://161710]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (3)
As of 2024-04-25 22:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found