http://qs321.pair.com?node_id=554398


in reply to Re: possible combinations in sequence
in thread possible combinations in sequence

Yours is also faster (albeit only 7% faster) than my grep approach, but my regexp approach is 13% faster than yours.

use strict; use warnings; use Algorithm::Loops qw( NestedLoops ); use Benchmark qw( cmpthese ); # Note: I improved the speed and the interface of ruzam's solution. sub _ruzam { my ($source, $hash) = @_; return if $hash->{$source}; my @parts = split(/:/, $source); return if @parts < 1; ++$hash->{$source}; for my $i (0..$#parts) { my @parts_copy = @parts; splice(@parts_copy, $i, 1); _ruzam(join(':', @parts_copy), $hash); } } sub ruzam { my ($source) = @_; my %hash; _ruzam($source, \%hash); return keys %hash; } sub ikegami1 { my ($source) = @_; my @parts = split(/:/, $source); return NestedLoops( [ [ 0..$#parts ], ( sub { [ $_+1..$#parts ] } ) x $#parts, ], { OnlyWhen => 1 }, sub { join(':', map $parts[$_], @_) }, ); } sub ikegami2 { my ($source) = @_; my @parts = split(/:/, $source); my @rv; for my $comb (1..2**@parts-1) { push @rv, join ':', map $parts[$_], grep $comb & (1<<$_), 0..$#parts; } return @rv; } sub ikegami3 { local $_ = ":$_[0]:"; my $parts = tr/:/:/ - 1; my $re = '(?{ "" })' . '(:[^:]*)(?=:)(?{ $^R . $^N })' . '(?:.*(:[^:]*)(?=:)(?{ $^R . $^N })' x ($parts-1) . ')?' x ($parts-1) . '(?{ push @rv, substr($^R, 1) })' . '(?!)'; { use re 'eval'; $re = qr/$re/; } local our @rv; /$re/; return @rv; } sub rhesa { my ($source) = @_; my @parts = split /:/, $source; my @rv; for my $i( 1 .. 2**@parts - 1 ) { my @ar; my $t = 0; while( $i > 0 ) { push @ar, $parts[$t] if $i & 1; $i >>= 1; $t++; } push @rv, join ':', @ar; } return @rv; } { local our $source = 'horse:cow:dog:cat'; my $expected = 'cat|cow|cow:cat|cow:dog|cow:dog:cat|dog|dog:cat|hor +se|horse:cat|horse:cow|horse:cow:cat|horse:cow:dog|horse:cow:dog:cat| +horse:dog|horse:dog:cat'; foreach (qw( ruzam ikegami1 ikegami2 ikegami3 rhesa )) { printf("%-9s ", "$_:"); my $rv = join '|', sort do { no strict 'refs'; \&{$_} }->($source); if ($rv eq $expected) { print("ok"); } else { print("bad ($rv)"); } print("\n"); } print("\n"); cmpthese(-3, { # ruzam => q{ use strict; use warnings; my @rv = ruzam our + $source; 1 }, # ikegami1 => q{ use strict; use warnings; my @rv = ikegami1 our + $source; 1 }, # ikegami2 => q{ use strict; use warnings; my @rv = ikegami2 our + $source; 1 }, # ikegami3 => q{ use strict; use warnings; my @rv = ikegami3 our + $source; 1 }, # rhesa => q{ use strict; use warnings; my @rv = rhesa our + $source; 1 }, ruzam => q{ my @rv = ruzam $source; 1 }, ikegami1 => q{ my @rv = ikegami1 $source; 1 }, ikegami2 => q{ my @rv = ikegami2 $source; 1 }, ikegami3 => q{ my @rv = ikegami3 $source; 1 }, rhesa => q{ my @rv = rhesa $source; 1 }, }); }
Rate ikegami1 ruzam ikegami2 rhesa ikegami3 ikegami1 2381/s -- -19% -66% -68% -72% ruzam 2944/s 24% -- -58% -61% -65% ikegami2 7072/s 197% 140% -- -5% -16% rhesa 7478/s 214% 154% 6% -- -11% ikegami3 8420/s 254% 186% 19% 13% --

Replies are listed 'Best First'.
Re^3: possible combinations in sequence
by ruzam (Curate) on Jun 09, 2006 at 04:01 UTC
    You guys are so beyond awesome! ikegami3 is nothing short of brilliance :) ++ to ikegami. rhesa, and liverpole.
    Thanks to ikegami's benchmark, I ran my own benchmarks. I excluded ikegami1 simply because of the 'Algorithm::Loops' dependency. Then just for personal interest, I copied ikegami3 and replaced the '$parts - 1' parts:
    sub ikegami3x { local $_ = ":$_[0]:"; my $parts = tr/:/:/ - 2; # take 2 here instead of -1 later my $re = '(?{ "" })' . '(:[^:]*)(?=:)(?{ $^R . $^N })' . '(?:.*(:[^:]*)(?=:)(?{ $^R . $^N })' x $parts . ')?' x $parts . '(?{ push @rv, substr($^R, 1) })' . '(?!)'; { use re 'eval'; $re = qr/$re/; } local our @rv; /$re/; return @rv; }

    I also included rhesa2 with a slight change to eliminate 'uninitialized' warnings
    sub rhesa2 { my @kw = split /:/, $_[0]; map { my @ar; my $t = 0; # initialize $t do { ($_ & 1) and push @ar, $kw[$t]; $t++; } while ($_ >>= 1); join ':', @ar; } ( 1 .. 2**@kw - 1 ); }

    I evened up all test functions to use $_[0], and finally I ran tests against different 'word counts' of the source (in actual use, $source will contain varying numbers of words).
    These are my benchmark results (I've run this several times to come up with more or less the same results)
    source: horse:cow:dog:cat Rate ruzam ikegami3 ikegami2 ikegami3x rhesa2 rhesa ruzam 4620/s -- -61% -61% -61% -66% -66% ikegami3 11764/s 155% -- -0% -1% -12% -14% ikegami2 11819/s 156% 0% -- -1% -12% -13% ikegami3x 11935/s 158% 1% 1% -- -11% -13% rhesa2 13444/s 191% 14% 14% 13% -- -2% rhesa 13657/s 196% 16% 16% 14% 2% -- source: horse Rate ikegami3 ikegami3x ruzam ikegami2 rhesa rhesa2 ikegami3 40841/s -- -1% -37% -62% -64% -72% ikegami3x 41226/s 1% -- -37% -62% -64% -72% ruzam 65317/s 60% 58% -- -39% -43% -55% ikegami2 107178/s 162% 160% 64% -- -6% -26% rhesa 114470/s 180% 178% 75% 7% -- -21% rhesa2 145232/s 256% 252% 122% 36% 27% -- source: horse:cat Rate ruzam ikegami3 ikegami3x ikegami2 rhesa rhesa2 ruzam 26853/s -- -2% -7% -48% -53% -58% ikegami3 27324/s 2% -- -5% -47% -52% -58% ikegami3x 28732/s 7% 5% -- -45% -50% -55% ikegami2 51965/s 94% 90% 81% -- -9% -19% rhesa 57233/s 113% 109% 99% 10% -- -11% rhesa2 64472/s 140% 136% 124% 24% 13% -- source: horse:cow:cat Rate ruzam ikegami3 ikegami3x ikegami2 rhesa rhesa2 ruzam 10772/s -- -41% -42% -58% -61% -64% ikegami3 18305/s 70% -- -1% -28% -33% -38% ikegami3x 18436/s 71% 1% -- -27% -33% -38% ikegami2 25353/s 135% 39% 38% -- -7% -15% rhesa 27363/s 154% 49% 48% 8% -- -8% rhesa2 29753/s 176% 63% 61% 17% 9% -- source: horse:cow:dog:cat:mouse Rate ruzam rhesa ikegami2 rhesa2 ikegami3x ikegami3 ruzam 1632/s -- -67% -68% -73% -75% -75% rhesa 5021/s 208% -- -3% -17% -24% -24% ikegami2 5159/s 216% 3% -- -14% -22% -22% rhesa2 6023/s 269% 20% 17% -- -9% -9% ikegami3x 6614/s 305% 32% 28% 10% -- -0% ikegami3 6634/s 307% 32% 29% 10% 0% --

    I can't nail down the box so the results can fluctuate quite a bit from test to test, but overall these seem to be consistent. rhesa2 takes the lead up to 4 words, ikegami3 takes over at 5 words (and even more so at 6 words). rhesa2 rocks in the low word counts, where as ikegami3 seems to have more overhead. In my real world use, the word count is usually 4 or less (4 was just a good example size), so rhesa2 wins and replaceses my original ruzam.

      The name of the variable doesn't match the value it contains in
      my $parts = tr/:/:/ - 2; # take 2 here instead of -1 later
      and it doesn't give you anything. That was a step in the wrong direction.

Re^3: possible combinations in sequence
by rhesa (Vicar) on Jun 09, 2006 at 02:27 UTC
    Your use of grep is adorable! I think it's by far the most readable version, with the clearest exposition of intent.

    The regexp on the other hand... the approach certainly is neat, i'll give you that ;) ++ for speed and ingenuity, but ouch does my head spin!

Re^3: possible combinations in sequence
by Limbic~Region (Chancellor) on Jun 09, 2006 at 12:49 UTC
    ikegami,
    I was tired last night when I found this thread but I wanted to point out Finding all Combinations. I would be interested in seeing how the ones that produce the correct order compare (specifically mine).

    Cheers - L~R