To me, solving this problem is all about finding the right language.
As expressed in the OP, Knuth expressions like floor(sqrt((3!)!)) are
clunky. The eye gets lost in ! and () and it hard to see structure.
But inspiration hits: every Knuth expression is just a highly nested
function of 3, and functions are evaluated from the inside out. So we
could use a reverse Polish notation to simplify things. Set f =
factorial, s = sqrt and i = floor(int). Then 26 = 3ffsi and in
general, Knuth expressions can be represented by strings of the form
3[isf]*. Much simpler!
This representation immediately suggests a solution to the problem:
generate all strings [isf]* up to length n and evaluate to each string
to see if we have a small integer. Simple, but inefficient. The
problem is that the number of strings generated grows as 3**n, and
even for small n this search can take long, long time.
This situation can be helped by thinking about the nature of the
functions used. Obviously, ii == i, so any sequence i+ can be reduced
to a single i. As japhy points out, any sequence of square roots and
floors [si]+i can drop the floors to become s+i without affecting the
result. And so we can reduce our language of Knuth expressions to the
form 3(s|if)+. The number of strings in this language grow as x**n
with x somewhat less than 2. This is much better than 3**n and so I
wrote a program that was able solve the problem in about 2 hours on my
poky PIII notebook:
use warnings;
use strict;
use Algorithm::Loops qw( NestedLoops );
use Math::BigInt lib => 'GMP';
my $start_base = shift;
my $depth = shift;
my @func = qw/i s f/; # alphabet used
# my @func = qw/s if/;
my @seq; # Stores shortest sequences associated with each positive i
+nteger [1,200]
$seq[3] = "";
my @todo; # Stores base elements yet to be searched.
push @todo, $start_base; # Start things off
while (@todo) {
my $base = shift @todo;
print "expanding $base\n";
expand( $base, $depth);
}
for my $val (1..200) {
print "$val: $seq[$val]\n" if defined $seq[$val];
}
### Subroutines
# Create all possible strings up to $depth long and evaluate each stri
+ng
sub expand
{
my ($base, $depth) = @_;
NestedLoops(
[ ( [@func] ) x $depth ],
{ OnlyWhen => 1 },
sub{ evaluate_bigint( $base, @_); },
);
}
# Evaluate the result of computing the expression given by a string
# Adds useful strings to @seq and pushes newly found values onto @todo
# queue.
sub evaluate_bigint
{
my $base = shift;
my $val = Math::BigInt->new( $base); # create a BigInt object
my $seq_str = join "", @_;
my @sequence = split //, $seq_str;
my $prefix = defined $seq[$base] ? $seq[$base] : "";
foreach my $f (@sequence) {
if ($f eq 'i') { # int
# int is a no-op for a BigInt
}
elsif ($f eq 's') { # sqrt
$val->bsqrt();
}
else { # factorial
return if $val->length() > 5; # limit max size
$val->bfac();
}
}
return if $val->length() > 3; # only want small integers
my $num = $val->numify(); # convert to ordinary scalar
$seq_str = $prefix . $seq_str;
$seq_str =~ s/^i+//;
$seq_str =~ s/i+/i/g;
$seq_str =~ s/fif/ff/g;
if (! defined($seq[$num])) { # new entry
push @todo, $num;
$seq[$num] = $seq_str;
print "$num: $seq_str\n";
}
elsif (length $seq_str < length $seq[$num]) {
$seq[$num] = $seq_str;
}
}
This code evaluates strings using the
Math::BigInt GMP
library in order to handle large factorials. It also uses a
memoization technique to reduce the search space. The idea is that
starting with 3 and applying all possible strings up to length n yields
several small integers as Knuth expressions, e.g., 5 = 3ffssi. Then
one can search starting with the small integers, e.g., 5 to yield new
small integers, and so on.
This memoization trick, however, leads to another big insight: each Knuth
sequence that we have the ability to compute can be described by a
sequence of small integers found as we evaluate the string. Between
each pair of numbers is an ascent to larger numbers created by f*,
followed a descent into smaller numbers created by s*, followed by an i.
This yields a further simplification of our language: to map from
small integer to small integer, search strings of the form
f*s*i. There are just n length n strings of the form f*s*i, versus the 2**n
before, so we get a massive speedup.
Here is the same program as before with our newly optimized language:
use warnings;
use strict;
use Algorithm::Loops qw( NestedLoops );
use Math::BigInt lib => 'GMP';
my $start_base = shift;
my $depth = shift;
my @seq; # Stores shortest sequences associated with each positive i
+nteger [1,200]
$seq[3] = "";
my @todo; # Stores base elements yet to be searched.
push @todo, $start_base; # Start things off
while (@todo) {
my $base = shift @todo;
print "expanding $base\n";
expand( $base, $depth);
}
for my $val (1..200) {
print "$val: $seq[$val]\n" if defined $seq[$val];
}
### Subroutines
# Create all possible strings up to $depth long and evaluate each stri
+ng
sub expand
{
my ($base, $depth) = @_;
for my $len (1..$depth) {
my $max_f = $len < 3 ? $len : 3;
for my $f (0..$max_f) {
my $str = 'f'x$f . 's'x($len-$f);
evaluate_bigint( $base, $str);
}
}
}
# Evaluate the result of computing the expression given by a string
# Adds useful strings to @seq and pushes newly found values onto @todo
# queue.
sub evaluate_bigint
{
my $base = shift;
my $val = Math::BigInt->new( $base); # create a BigInt object
my $seq_str = shift;
my $prefix = defined $seq[$base] ? $seq[$base] : "";
# my $str = $prefix . $seq_str;
# print "base: $base, seq_str: $seq_str, str: $str\n";
my @sequence = split //, $seq_str;
foreach my $f (@sequence) {
if ($f eq 'i') { # int
# int is a no-op for a BigInt
}
elsif ($f eq 's') { # sqrt
$val->bsqrt();
}
else { # factorial
return if $val->length() > 3; # limit max size
# print "length before fac:", $val->length(), "\n";
$val->bfac();
}
}
return if $val->length() > 3; # only want small integers
my $num = $val->numify(); # convert to ordinary scalar
$seq_str = $prefix . $seq_str;
if (! defined($seq[$num])) { # new entry
push @todo, $num;
$seq[$num] = $seq_str;
print "$num: $seq_str\n";
}
elsif (length $seq_str < length $seq[$num]) {
$seq[$num] = $seq_str;
}
}
This runs in 1.4 seconds -- a speedup of 5000!
Update: Corrected a spelling mistake, thanks to gam3.