http://qs321.pair.com?node_id=554387


in reply to possible combinations in sequence

My basic idea is to map the array indices to bits in a binary number. If a bit is on, you take that element out of the source array. For example:
0 = 0b0000 --> {nothing} 3 = 0b0011 --> 'horse:cow' 13 = 0b1101 --> 'horse:dog:cat'
The algorithm then simply becomes a loop over 1 .. 2**@kw -1, testing the bits for each number.

Here's my first implementation of it. It's probably not as efficient as possible yet.

sub rhesa { # initial source in sequence order my $source = 'horse:cow:dog:cat'; my @kw = split /:/, $source; my @res; for my $i( 1 .. 2**@kw - 1 ) { my @ar; my $t; while( $i > 0 ) { push @ar, $kw[$t] if $i & 1; $i >>= 1; $t++; } push @res, join ':', @ar; } return @res; }
I'm a bit irritated with the number of temporary variables, but I can't think of anything prettier just now. Hope it helps :)

BTW, a simple Benchmark comparison showed a 200% speed increase over your version.

Replies are listed 'Best First'.
Re^2: possible combinations in sequence
by ikegami (Patriarch) on Jun 09, 2006 at 00:45 UTC

    Yours is also faster (albeit only 7% faster) than my grep approach, but my regexp approach is 13% faster than yours.

    Rate ikegami1 ruzam ikegami2 rhesa ikegami3 ikegami1 2381/s -- -19% -66% -68% -72% ruzam 2944/s 24% -- -58% -61% -65% ikegami2 7072/s 197% 140% -- -5% -16% rhesa 7478/s 214% 154% 6% -- -11% ikegami3 8420/s 254% 186% 19% 13% --
      You guys are so beyond awesome! ikegami3 is nothing short of brilliance :) ++ to ikegami. rhesa, and liverpole.
      Thanks to ikegami's benchmark, I ran my own benchmarks. I excluded ikegami1 simply because of the 'Algorithm::Loops' dependency. Then just for personal interest, I copied ikegami3 and replaced the '$parts - 1' parts:
      sub ikegami3x { local $_ = ":$_[0]:"; my $parts = tr/:/:/ - 2; # take 2 here instead of -1 later my $re = '(?{ "" })' . '(:[^:]*)(?=:)(?{ $^R . $^N })' . '(?:.*(:[^:]*)(?=:)(?{ $^R . $^N })' x $parts . ')?' x $parts . '(?{ push @rv, substr($^R, 1) })' . '(?!)'; { use re 'eval'; $re = qr/$re/; } local our @rv; /$re/; return @rv; }

      I also included rhesa2 with a slight change to eliminate 'uninitialized' warnings
      sub rhesa2 { my @kw = split /:/, $_[0]; map { my @ar; my $t = 0; # initialize $t do { ($_ & 1) and push @ar, $kw[$t]; $t++; } while ($_ >>= 1); join ':', @ar; } ( 1 .. 2**@kw - 1 ); }

      I evened up all test functions to use $_[0], and finally I ran tests against different 'word counts' of the source (in actual use, $source will contain varying numbers of words).
      These are my benchmark results (I've run this several times to come up with more or less the same results)
      source: horse:cow:dog:cat Rate ruzam ikegami3 ikegami2 ikegami3x rhesa2 rhesa ruzam 4620/s -- -61% -61% -61% -66% -66% ikegami3 11764/s 155% -- -0% -1% -12% -14% ikegami2 11819/s 156% 0% -- -1% -12% -13% ikegami3x 11935/s 158% 1% 1% -- -11% -13% rhesa2 13444/s 191% 14% 14% 13% -- -2% rhesa 13657/s 196% 16% 16% 14% 2% -- source: horse Rate ikegami3 ikegami3x ruzam ikegami2 rhesa rhesa2 ikegami3 40841/s -- -1% -37% -62% -64% -72% ikegami3x 41226/s 1% -- -37% -62% -64% -72% ruzam 65317/s 60% 58% -- -39% -43% -55% ikegami2 107178/s 162% 160% 64% -- -6% -26% rhesa 114470/s 180% 178% 75% 7% -- -21% rhesa2 145232/s 256% 252% 122% 36% 27% -- source: horse:cat Rate ruzam ikegami3 ikegami3x ikegami2 rhesa rhesa2 ruzam 26853/s -- -2% -7% -48% -53% -58% ikegami3 27324/s 2% -- -5% -47% -52% -58% ikegami3x 28732/s 7% 5% -- -45% -50% -55% ikegami2 51965/s 94% 90% 81% -- -9% -19% rhesa 57233/s 113% 109% 99% 10% -- -11% rhesa2 64472/s 140% 136% 124% 24% 13% -- source: horse:cow:cat Rate ruzam ikegami3 ikegami3x ikegami2 rhesa rhesa2 ruzam 10772/s -- -41% -42% -58% -61% -64% ikegami3 18305/s 70% -- -1% -28% -33% -38% ikegami3x 18436/s 71% 1% -- -27% -33% -38% ikegami2 25353/s 135% 39% 38% -- -7% -15% rhesa 27363/s 154% 49% 48% 8% -- -8% rhesa2 29753/s 176% 63% 61% 17% 9% -- source: horse:cow:dog:cat:mouse Rate ruzam rhesa ikegami2 rhesa2 ikegami3x ikegami3 ruzam 1632/s -- -67% -68% -73% -75% -75% rhesa 5021/s 208% -- -3% -17% -24% -24% ikegami2 5159/s 216% 3% -- -14% -22% -22% rhesa2 6023/s 269% 20% 17% -- -9% -9% ikegami3x 6614/s 305% 32% 28% 10% -- -0% ikegami3 6634/s 307% 32% 29% 10% 0% --

      I can't nail down the box so the results can fluctuate quite a bit from test to test, but overall these seem to be consistent. rhesa2 takes the lead up to 4 words, ikegami3 takes over at 5 words (and even more so at 6 words). rhesa2 rocks in the low word counts, where as ikegami3 seems to have more overhead. In my real world use, the word count is usually 4 or less (4 was just a good example size), so rhesa2 wins and replaceses my original ruzam.

        The name of the variable doesn't match the value it contains in
        my $parts = tr/:/:/ - 2; # take 2 here instead of -1 later
        and it doesn't give you anything. That was a step in the wrong direction.

      Your use of grep is adorable! I think it's by far the most readable version, with the clearest exposition of intent.

      The regexp on the other hand... the approach certainly is neat, i'll give you that ;) ++ for speed and ingenuity, but ouch does my head spin!

      ikegami,
      I was tired last night when I found this thread but I wanted to point out Finding all Combinations. I would be interested in seeing how the ones that produce the correct order compare (specifically mine).

      Cheers - L~R

Re^2: possible combinations in sequence
by liverpole (Monsignor) on Jun 09, 2006 at 00:45 UTC
    rhesa, I like your version.

    Here's my somewhat golfed variation:

    sub rhesa2 { my $source = shift; my @kw = split /:/, $source; map { my (@ar, $t); do { ($_ & 1) and push @ar, $kw[$t]; $t++; } while ($_ >>= 1); join ':', @ar } ( 1 .. 2**@kw - 1 ) } my @res = rhesa2('horse:cow:dog:cat');

    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re^2: possible combinations in sequence
by grinder (Bishop) on Jun 09, 2006 at 05:46 UTC
    My basic idea is to map the array indices to bits in a binary number. If a bit is on, you take that element out of the source array.

    This is precisely the same approach that I used in Data::PowerSet, for indeed, what the OP is looking for is the power set of the list.

    • another intruder with the mooring in the heart of the Perl