comment on

To me, solving this problem is all about finding the right language.

As expressed in the OP, Knuth expressions like floor(sqrt((3!)!)) are clunky. The eye gets lost in ! and () and it hard to see structure.

But inspiration hits: every Knuth expression is just a highly nested function of 3, and functions are evaluated from the inside out. So we could use a reverse Polish notation to simplify things. Set f = factorial, s = sqrt and i = floor(int). Then 26 = 3ffsi and in general, Knuth expressions can be represented by strings of the form 3[isf]*. Much simpler!

This representation immediately suggests a solution to the problem: generate all strings [isf]* up to length n and evaluate to each string to see if we have a small integer. Simple, but inefficient. The problem is that the number of strings generated grows as 3**n, and even for small n this search can take long, long time.

This situation can be helped by thinking about the nature of the functions used. Obviously, ii == i, so any sequence i+ can be reduced to a single i. As japhy points out, any sequence of square roots and floors [si]+i can drop the floors to become s+i without affecting the result. And so we can reduce our language of Knuth expressions to the form 3(s|if)+. The number of strings in this language grow as x**n with x somewhat less than 2. This is much better than 3**n and so I wrote a program that was able solve the problem in about 2 hours on my poky PIII notebook:

use warnings;
use strict;

use Algorithm::Loops qw( NestedLoops );
use Math::BigInt lib => 'GMP';

my $start_base = shift;
my $depth = shift;

my @func = qw/i s f/;  # alphabet used
# my @func = qw/s if/;
my @seq;   # Stores shortest sequences associated with each positive i
+nteger [1,200]
$seq[3] = "";

my @todo;  # Stores base elements yet to be searched.
push @todo, $start_base; # Start things off

while (@todo) {
   my $base = shift @todo;
   print "expanding $base\n";
   expand( $base, $depth);
}

for my $val (1..200) {
   print "$val: $seq[$val]\n" if defined $seq[$val];
}


### Subroutines

# Create all possible strings up to $depth long and evaluate each stri
+ng
sub expand
{
   my ($base, $depth) = @_;

   NestedLoops(
               [ ( [@func] ) x $depth ],
               { OnlyWhen => 1 },
               sub{ evaluate_bigint( $base, @_); },
              );
}

# Evaluate the result of computing the expression given by a string
# Adds useful strings to @seq and pushes newly found values onto @todo
# queue.
sub evaluate_bigint
{
   my $base = shift;
   my $val = Math::BigInt->new( $base);  # create a BigInt object

   my $seq_str = join "", @_;
   my @sequence = split //, $seq_str;
   my $prefix = defined $seq[$base] ? $seq[$base] : "";

   foreach my $f (@sequence) {
      if ($f eq 'i') {    # int
         # int is a no-op for a BigInt
      }
      elsif ($f eq 's') { # sqrt
         $val->bsqrt();
      }
      else {              # factorial
         return if $val->length() > 5;  # limit max size
         $val->bfac();
      }
   }
   return if $val->length() > 3;  # only want small integers
   my $num = $val->numify();      # convert to ordinary scalar

   $seq_str = $prefix . $seq_str;
   $seq_str =~ s/^i+//;
   $seq_str =~ s/i+/i/g;
   $seq_str =~ s/fif/ff/g;

   if (! defined($seq[$num])) { # new entry
      push @todo, $num;
      $seq[$num] = $seq_str;
      print "$num: $seq_str\n";
   }
   elsif (length $seq_str < length $seq[$num]) {
      $seq[$num] = $seq_str;
   }
}
[download]

This code evaluates strings using the Math::BigInt GMP library in order to handle large factorials. It also uses a memoization technique to reduce the search space. The idea is that starting with 3 and applying all possible strings up to length n yields several small integers as Knuth expressions, e.g., 5 = 3ffssi. Then one can search starting with the small integers, e.g., 5 to yield new small integers, and so on.

This memoization trick, however, leads to another big insight: each Knuth sequence that we have the ability to compute can be described by a sequence of small integers found as we evaluate the string. Between each pair of numbers is an ascent to larger numbers created by f*, followed a descent into smaller numbers created by s*, followed by an i. This yields a further simplification of our language: to map from small integer to small integer, search strings of the form f*s*i. There are just n length n strings of the form f*s*i, versus the 2**n before, so we get a massive speedup.

Here is the same program as before with our newly optimized language:

use warnings;
use strict;

use Algorithm::Loops qw( NestedLoops );
use Math::BigInt lib => 'GMP';

my $start_base = shift;
my $depth = shift;

my @seq;   # Stores shortest sequences associated with each positive i
+nteger [1,200]
$seq[3] = "";

my @todo;  # Stores base elements yet to be searched.
push @todo, $start_base; # Start things off

while (@todo) {
   my $base = shift @todo;
   print "expanding $base\n";
   expand( $base, $depth);
}

for my $val (1..200) {
   print "$val: $seq[$val]\n" if defined $seq[$val];
}


### Subroutines

# Create all possible strings up to $depth long and evaluate each stri
+ng
sub expand
{
   my ($base, $depth) = @_;

   for my $len (1..$depth) {
      my $max_f = $len < 3 ? $len : 3;
      for my $f (0..$max_f) {
         my $str = 'f'x$f . 's'x($len-$f);
         evaluate_bigint( $base, $str);
      }
   }
}

# Evaluate the result of computing the expression given by a string
# Adds useful strings to @seq and pushes newly found values onto @todo
# queue.
sub evaluate_bigint
{
   my $base = shift;
   my $val = Math::BigInt->new( $base);  # create a BigInt object

   my $seq_str = shift;
   my $prefix = defined $seq[$base] ? $seq[$base] : "";
#    my $str = $prefix . $seq_str;
#    print "base: $base,  seq_str: $seq_str,  str: $str\n";

   my @sequence = split //, $seq_str;

   foreach my $f (@sequence) {
      if ($f eq 'i') {    # int
         # int is a no-op for a BigInt
      }
      elsif ($f eq 's') { # sqrt
         $val->bsqrt();
      }
      else {              # factorial
         return if $val->length() > 3;  # limit max size
         #  print "length before fac:", $val->length(), "\n";
         $val->bfac();
      }
   }
   return if $val->length() > 3;  # only want small integers
   my $num = $val->numify();       # convert to ordinary scalar

   $seq_str = $prefix . $seq_str;

   if (! defined($seq[$num])) { # new entry
      push @todo, $num;
      $seq[$num] = $seq_str;
      print "$num: $seq_str\n";
   }
   elsif (length $seq_str < length $seq[$num]) {
      $seq[$num] = $seq_str;
   }
}
[download]

This runs in 1.4 seconds -- a speedup of 5000!

Update: Corrected a spelling mistake, thanks to gam3.

-Mark

In reply to Re: Challenge: Chasing Knuth's Conjecture by kvale
in thread Challenge: Chasing Knuth's Conjecture by kvale

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


good chemistry is complicated, and a little bit messy -LW
	PerlMonks