http://qs321.pair.com?node_id=79050
Category: Mathematics
Author/Contact Info Alexander (Lexicon) Scouras
http://code.anapraxis.net
Description: Math::Combinatorics::Combinator
Version 0.90 (Release Candidate A)
Indexes and Generates R-Combinations

Given a @Master, an $R (subset size) and a $Combindex (an index), Combinator will calculate which @Combination that is (in Lexicographical Order) and return the subset.

Given a @Master and a @Combination, Combinator can reverse itself and calculate the $Combindex.

All this and much, much more is in the pod. This is an Object Oriented Module. However, the proceedural version is about twice as fast, and available on the web site. I am currently trying to figure out how to combine them into one module w/o loss of functionality, loss of speed, or extreme confutsion to the user, and any tips would be appreciated. I will produce the Permutator version of this file in the next couple months.
#!/usr/local/bin/perl -w

package Math::Combinatorics::Combinator;

####################################################################
#
# Math::Combinatorics::Combinator
# Version 0.90    22 April 2001
# Copyright 2001, Alexander Scouras
# All Rights Reserved
# Bug reports or comments may be sent to lexicon@anapraxis.net
#
# This program is free software.
# It may be distributed and/or modified under either the
# Perl Artistic License or the GNU General Public License.
#
####################################################################
#
# This is a combinatorics module for Perl.  I assume anyone digging
# around in here knows their Combinatorics.  This is going to be a
# quick couple of paragraphs describing how this code 'thinks about'
# a combination in terms of it's data structure.  One might breeze
# through the POD before reading this for a refresher.
#
# Each $Element in the @Combination has a certain range.  The size
# of this range is $Holes, as described in the POD.  Everything
# describing an $Element's Index is a reference to which of its own
# $Holes it occupies in the @Combination.  @Combination is usually
# stored as a collection of $Holes indexes.  At the end of
# &Combinate, the actual @Combination is created by adding an
# $Element's $Index to the rank of the $Element itself (0th through
# R-1'th elements), and using this as an index into the @Master to
# find the value. The input (to Decombinate) and output (from
# Combinate) of actual elements is in the @Combination array.
#
# So how does it determine which $Elem ends up in which of its
# $Holes?  The cool stuff is in RANDOM &Combinate and &Decombinate.
# In RANDOM, all the majic happens in the @Steps_Incr / @Steps_Zero
# population.  There we determine how many @Steps it takes to place
# an $Elem into a given $Index.  Basically, we brute force calculate
# it by running through all the posibilities of Lexicographic
# Ordering.  @Steps_Incr takes this a step at a time, then
# @Steps_Zero gives us some shortcuts for later calculations.  All
# this happens during Initialize.
#
# Next we will discuss the actual RANDOM &Combinate process only.
# For &Decombinate, you do all this in reverse (with a couple
# extra steps discussed there).  Starting with the lower #Elem's,
# we try to allocate off $Steps from our $Combindex.  For each
# $Elem, we test (using @Steps_Zero) how high it's index can reach,
# pushing it as high as $Combindex will let us.  Finding the max, we
# allocate that many $Steps to that $Elem and move to the next.  We
# then add the previous $Elem's $Index to the current one to
# maintain order and avoid repeating previous @Combinatinos.
# Repeat the process till we get to the second to last $Elem.  If
# there are any unallocated $Steps, we add them to the last $Elem,
# as well as the 2nd to last's $Index, of course.
#
# The LINEAR algorithm is discussed where it is implimented, and is
# easy to understand simply by examining the code (once one
# understands this $Holes concept anyway).
#
####################################################################

# Constants for the Object Properties Array

my $C = 0; # Constant Count
use constant MASTER         => $C++;  # Source Array
use constant N              => $C++;  # Size of Source Array
use constant R              => $C++;  # Size of Sub-Set
use constant HOLES          => $C++;  # Possible values for an Elem
use constant COMBINDEX      => $C++;  # Index of the Combination
use constant ABSTRACT       => $C++;  # Array with Sub-Set,
use constant MIN_COMBINDEX  => $C++;  # Minimum Index, always 0
use constant MAX_COMBINDEX  => $C++;  # Maximum Index.Choose(N,R)
use constant STEPS_INCR     => $C++;  # Steps to Index from 0
use constant STEPS_ZERO     => $C++;  # Steps to Index from previous

use strict;
use Math::Combinatorics qw(:common);

$Math::Combinatorics::Combinator::VERSION
            = 0.90;   # v1.00 Release Candidate.



####################################################################
#   INITIALIZE COMBINATOR (
#                $Return_Size # Size of the set to return
#                @Master,     # The array that will be operated on
#              )
####################################################################
# Constructor which Initializes a $Combinator for producing
# R-Combinations from a given @Master.
####################################################################

sub Initialize {
  my $s               = [];
  bless $s;

  $s->[R            ] = shift;
  $s->[MASTER       ] = shift;
  $s->[N            ] = scalar @{$s->[MASTER]};
  $s->[HOLES        ] = $s->[N] - $s->[R] + 1;
  $s->[COMBINDEX    ] = -2;
  $s->[ABSTRACT     ] = [];
  $s->[MIN_COMBINDEX] = 0;
  $s->[MAX_COMBINDEX] = Choose($s->[N], $s->[R]) - 1;

  $s->Init_Steps_Incr();
  $s->Init_Steps_Zero();

  return $s;

}

####################################################################
#   @STEPS (INCREMENTALY) [ ELEMENT ] [ INDEX ]
####################################################################
# This is a 2 dimensional array of the steps between INDEX-1 and
# INDEX at the same HEIGHT.
####################################################################

sub Init_Steps_Incr {
  my $s           = shift;
  my $R           = $s->[R];
  my $Holes       = $s->[HOLES];
  my @Steps_Incr;

  for (my $e = $R-1; $e >= 0; $e--) {
    $Steps_Incr[$e] = [];         # Create subarray
    # 0th Index: Default location, takes no steps;
    $Steps_Incr[$e][0] = 0;
    for my $i (1..$Holes-1) {
      # Last Element: moves one step at a time
      if    ($e == $R-1) { $Steps_Incr[$e][$i] = 1 }
      # First Index: the Sum of the Steps of Prev Element + 1;
      elsif ($i == 1) {
        $Steps_Incr[$e][$i] = 1;
        for my $x ($i..$Holes-1) {
          $Steps_Incr[$e][$i] += $Steps_Incr[$e+1][$x] }}
      # Normally: Steps to Prev Index
      #         - Steps to Prev Height's Prev Index
      else {
        $Steps_Incr[$e][$i] =
          $Steps_Incr[$e  ][$i-1] -
          $Steps_Incr[$e+1][$i-1]}
    }
  }

  $s->[STEPS_INCR] = \@Steps_Incr;

}

####################################################################
#   @STEPS (FROM ZERO) [ HEIGHT ] [ INDEX ]
####################################################################
# This is an array of the number of steps necessary to place an
# element at HEIGHT in Return_Size to a given INDEX in the Holes.
#
# This relies on @Steps_Incr for it's base calculations.
####################################################################

sub Init_Steps_Zero {
  my $s           = shift;
  my $R           = $s->[R];
  my $Holes       = $s->[HOLES];
  my @Steps_Zero;
  my @Steps_Incr  = @{$s->[STEPS_INCR]};

  for (my $e = $R-1; $e >= 0; $e--) {
    $Steps_Zero[$e] = [];
    # 0th Index: Default location, takes no steps;
    $Steps_Zero[$e][0] = 0;
    # 1st Index: Same as Steps_Incr ( same base for calculation (0))
    $Steps_Zero[$e][1] = $Steps_Incr[$e][1];
    # Normally: Steps to Prev Index + Steps to Next Index
    for my $i (2..$Holes-1) {
      $Steps_Zero[$e][$i] =
        $Steps_Incr[$e][$i] +
        $Steps_Zero[$e][$i-1];
    }
  }

  $s->[STEPS_ZERO] = \@Steps_Zero;

}




####################################################################
#   COMBINATE (
#                 $COMBINATION,   # WHICH COMBINATION OF ARRAY
#              )
####################################################################
# Returns the COMBINATION th R-Combination of an ARRAY as enumerated
# in Lexicographic Order.  R and ARRAY are received by COMBINATOR
# during the INIT_COMBINATOR initialization.
####################################################################

sub Combinate {

  my $s           = shift;
  my $R           = $s->[R];
  my $N           = $s->[N];
  my $Holes       = $s->[HOLES];
  my @Master      = @{$s->[MASTER]};
  my @Abstract    = @{$s->[ABSTRACT]};

  my $Combindex   = shift;

  die "The combination $Combindex is out of range.  " .
      "Valid indexes are between $s->[MIN_COMBINDEX] and " .
      "$s->[MAX_COMBINDEX] for an array of size $s->[N]"
      if ($Combindex < $s->[MIN_COMBINDEX]
      ||  $Combindex > $s->[MAX_COMBINDEX]);

  #------------------ LINEAR ALGORITHM -----------------------------
  # We save the @Combination from previous calculations and simply
  # increment it to the next @Combination.  This basically means
  # incrementing later elements to their max, then incrementing the
  # next element and resetting its followers to its value and doing
  # it all over again.
  #-----------------------------------------------------------------

  if ($s->[COMBINDEX] + 1 == $Combindex) {
    $s->[COMBINDEX] = $Combindex;
    my $Elem = $R - 1;
    while ($Abstract[$Elem] == $Holes-1) { $Elem-- }
    my $New_Index = ++$Abstract[$Elem];
    for my $Elem2 ( $Elem+1..$R-1 ) {
      $Abstract[$Elem2] = $New_Index;
    }

  #------------------ RANDOM ALGORITHM -----------------------------
  # RANDOM goes through the elements in order and gives them as high
  # an index as possible based on the current combination.  When an
  # earlier element is pushed as far as possible, the next element
  # is started from wherever the earlier element stopped.  This 
  # keeps the elements in order.
  #
  # Calculating the steps properly involved some special math in the
  # two steps arrays and a timely subtraction of the extra index
  # gained from earlier incremented elements.
  #
  # The first thing to occur is to save the $Combindex, as it will
  # be destroyed later when calculating later elements.
  #-----------------------------------------------------------------
  } else {

    my $R               = $s->[R];
    my $N               = $s->[N];
    my $Holes           = $s->[HOLES];
    my @Steps_Zero      = @{$s->[STEPS_ZERO]};
    my $Steps           = 0;
       $s->[COMBINDEX]  = $Combindex;
       $Abstract[0]     = 0;

    ELEM:  #For all but the last element:
    for my $Elem (0..$R-2) {
      # The index of earlier elements is added to later ones
      $Abstract[$Elem] = $Abstract[$Elem-1] if $Elem > 0;
      # Temp storage of the Element's index.
      my $Index = $Abstract[$Elem];
      # Partial Steps, a subcount of steps for the element
      my $pSteps = 0;

      # An infinite loop but:
      while (1) {
        # We can break when the index reaches the maximum ($Holes)
        # Set the combination to the maximum and exit all loops.
        if ($Index == $Holes-1)   { $Combindex       = 0;
                                    $Abstract[$Elem] = $Index;
                                    next ELEM
                                  }
        # Check how many steps to increment the index one more time
        $Steps = $Steps_Zero[$Elem][$Index+1]
               - $Steps_Zero[$Elem][$Abstract[$Elem]];
        # If we have that many steps left, Increment the Index and
        # note (in $pSteps) how many steps it takes to reach it
        if ($Combindex >= $Steps) { $Index++;  $pSteps = $Steps }

        # If we do not have that many steps left, subtract the
        # pSteps from the Combindex, set the element, and move to
        # the next element to pass some indicies into.
        else                      { $Combindex      -= $pSteps;
                                    $Abstract[$Elem] = $Index;
                                    next ELEM
                                  }
      }
    }
    # Any left over steps are given to the last element.


    $Abstract[$R-1]  = $Combindex;
    $Abstract[$R-1] += $Abstract[$R-2] unless $R < 2;
  }

  #------------------ PRODUCE SUBSET FROM INDICES ------------------
  # Here the actual @Combination is produced, in the array
  # @New_Combination.  We leave @Combination untouched for use in
  # future calls to the LINEAR algorithm.
  #
  # The $Elem's $Index is added to it's own rank (0th element += 0,
  # 4th element += 4, etc...) and this number is used as an index
  # into the @Master.  The appropriate element from the @Master is
  # copied into it's place in the @New_Combination, for all elements
  # and the @New_Combination is returned as the actual SubSet.
  #-----------------------------------------------------------------


  $s->[ABSTRACT] = \@Abstract;

  my @Combination = ();
  for my $Elem (0..$R-1) {
    $Combination[$Elem] = $Master[$Abstract[$Elem] + $Elem]
  }
  return @Combination;
}


####################################################################
#   DECOMBINATE (
#                 @COMBINATION  # WHICH COMBINATION OF THE ARRAY
#               )
####################################################################
# Processes a set and determines it's Lexicographic Index as an
#   R-Combination of an Array determined in &Init_Combinator;
####################################################################

sub Decombinate {

  my $s           = shift;

  my @Combination = @{+shift};
  my $R           = $s->[R];
  my $N           = $s->[N];

#  my @Abstract   = @{$s->[ABSTRACT]};
  my @Steps_Zero = @{$s->[STEPS_ZERO]};
  my @Master     = @{$s->[MASTER]};

  die '@Combinations passed to Decombinate must be the same size ' .
      'as R from initialization.  R=' .
       $R . ' and $@Combination=' . scalar(@Combination)
      if (scalar (@Combination) != $R);

  my $Combindex;        # Lexicographical Index of the R-Combination
  my $Elem = 0;         # Element of the Return Set
  my $Index;            # Index of an individual $Elem
  my @Master_Abstract;  # Index representation of @Master
  my @Abstract;         # Index representation of @Combination
     $Abstract[$_] = 0 for (1..$R);

  # Abstract @Combination to its array indicies (causes sort)
  #  If the element is in the @Combination, the corresponding bit
  #  in the @Master_Abstract is turned on.

  for (my $i = 0; $i < $R; $i++) {
    for (my $j = 0; $j < $N;  $j++) {
      if ($Combination[$i] eq $Master[$j]) {
        $Master_Abstract[$j] = 1 ;
      }
    }
  }

  # Now, going through the $Master_Abstract in order, calculate the
  #  $Combindex.  This is, the steps to push this @Combination
  #  element to the correct element in the @Master, minus the steps
  #  it had already gained from previous elements pushing it up.

  for (my $i = 0; $i < scalar(@Master_Abstract); $i++) {
    next unless $Master_Abstract[$i];
    $Index = $Abstract[$Elem] = $i - $Elem;
    $Combindex += $Steps_Zero[$Elem][$Index]
                - $Steps_Zero[$Elem][$Abstract[$Elem-1]];
    $Elem++;
  }
  return $Combindex;
}

1;

# The POD, almost as long as the code itself, 
# has been left off.  You may find it at:
# http://code.anapraxis.net