http://qs321.pair.com?node_id=718098


in reply to Challenge: Simple algorithm for continuing series of integers

BTW, this test case is wrong:
[[0, 1, 1, 2, 3], 8],
Here is my approach, which will at least identify the homogenous linear recurrences (basically everything here except the factorial example). The main work is done by candidate_recurrence. A degree-D linear recurrence is something like this:
a(n) = c(1)a(n-1) + ... + c(D)a(n-D)
where the c(i) values are constant coefficients. Any 2D-1 values uniquely define a degree-D linear recurrence, and that recurrence is what candidate_recurrence computes. It also takes another parameter which says whether to consider a recurrence with a constant additive term as well, like
a(n) = c(1)a(n-1) + ... + c(D)a(n-D) + c(0)
In either case, it's just some basic linear algebra, for which the code currently uses PDL to solve.

So given this candidate_recurrence, the sequence-identification algorithm is pretty simple. Just start with D very low and keep increasing it until you reach one that works with the entire sequence (low D = low complexity, so this is a kind of Occam's razor simplest (linear) explanation for the sequence).

This works for all of your examples except for the factorials (not a homogenous linear recurrence). For test cases (1) and (1,2), it gives reasonable answers right now (all-ones sequence, and powers of two respectively), where your test wants it to fail. And for test case (1,2,3,1), there are precision issues with PDL's linear algebra, which is unfortunate. Still, all we need is to take the inverse of an integer matrix. There are no inherent issues with such a computation in general, I just didn't want to do it by hand, and didn't realize that PDL had precision issues!

#!/usr/bin/perl use PDL; use List::Util; sub candidate_recurrence { my ($d, $lin, @values) = @_; my $N = $d+$lin; my $samp = 2*$N -$lin; ## number of samples we'll need ## some base cases return if $N == 0; return Recurrence->new( 0, 1, $values[0] ) if $d == 0 and $lin == +1; return if @values < $samp; # need enough points if ($d == 1 and $lin == 0) { return $values[0] ? Recurrence->new( 1, 0, $values[1] / $values[ +0] ) : undef; } my @samples = @values[ 0 .. $samp-1 ]; my @matrix; for (0 .. $N-1) { push @matrix, [ @samples[ $_ .. $_+$d-1 ] ]; } if ($lin) { push @$_, 1 for @matrix; } my $M = pdl @matrix; my $v = pdl( [ @samples[ $d .. $samp-1 ] ] )->transpose; return if $M->det == 0; return Recurrence->new( $d, $lin, list( $M->inv() x $v ) ); } sub check_recurrence { my $R = shift; for my $n ($R->d .. $#_-1) { return if $R->next( @_[ 0 .. $n ] ) != $_[$n+1]; } return 1; } sub identify { for my $d (0 .. 2) { for my $lin (0 .. 1) { my $R = candidate_recurrence($d, $lin, @_); return $R if $R and check_recurrence($R, @_); } } } sub series { my $R = identify(@_) or return; return $R->next(@_); } ############ ## cute recurrence class to make things simpler { package Recurrence; sub new { my ($pkg, $d, $lin, @coeff) = @_; my $const = $lin ? pop @coeff : 0; return bless { d=>$d, lin=>$lin, coeff=>\@coeff, const=>$const }, +$pkg; } sub next { my $self = shift; return if @_ < $self->{d}; my @window = @_[ -$self->{d} .. -1 ]; $self->{const} + List::Util::sum map { $self->{coeff}[$_] * $window[$_] } 0 .. +$#window; } sub print { my $self = shift; my $d = $self->d; my @terms = grep { !/^0\*/ } map { $self->{coeff}[$d-$_] . "*a(n-$_)" } 1 .. $self- +>{d}; push @terms, $self->{const} if $self->{const} || @terms == 0; my $str = "a(n) = " . join " + ", @terms; $str =~ s/\b 1\*//gx; return $str; } sub d { return $_[0]->{d}; } }

As for extending this to work with more general sequences, it seems like it would be quite difficult to get something that works with too many more classes of sequences. Fortunately linear recurrences encompass many common use cases. When your "magic star" recognizes 1, 1, 2, 5, 14, 42, 132 as the first few Catalan numbers, and 2, 6, 20, 70, 252 as the first few central binomial coefficients, you'll be onto something special ;)

Update: Replacing your main loop with the following

for my $t (@tests) { my @list = @{$t->[0]}; print "@list : "; my $R = identify(@list); if ($R) { print "identified as ", $R->print, " : next = ", $R->next(@lis +t), $/; } else { print "??\n"; } }
.. gives some more verbose output:
1 : identified as a(n) = 1 : next = 1 1 1 : identified as a(n) = 1 : next = 1 0 0 : identified as a(n) = 0 : next = 0 1 2 : identified as a(n) = 2*a(n-1) : next = 4 0 1 2 : identified as a(n) = a(n-1) + 1 : next = 3 1 0 -1 : identified as a(n) = a(n-1) + -1 : next = -2 1 2 3 : identified as a(n) = a(n-1) + 1 : next = 4 1 2 4 : identified as a(n) = 2*a(n-1) : next = 8 2 4 8 : identified as a(n) = 2*a(n-1) : next = 16 1 3 9 : identified as a(n) = 3*a(n-1) : next = 27 1 -1 1 -1 : identified as a(n) = -a(n-1) : next = 1 -1 1 -1 1 : identified as a(n) = -a(n-1) : next = -1 1 0 1 0 : identified as a(n) = -a(n-1) + 1 : next = 1 0 1 0 1 : identified as a(n) = -a(n-1) + 1 : next = 0 1 1 2 3 5 : identified as a(n) = a(n-1) + a(n-2) : next = 8 0 1 1 2 3 : identified as a(n) = a(n-1) + a(n-2) : next = 5 1 2 3 5 8 : identified as a(n) = a(n-1) + a(n-2) : next = 13 1 2 6 24 120 : ?? 1 0 0 1 : ?? 1 2 3 1 : identified as a(n) = 5*a(n-1) + -7*a(n-2) : next = -16 1 3 5 : identified as a(n) = a(n-1) + 2 : next = 7 2 4 6 : identified as a(n) = a(n-1) + 2 : next = 8

blokhead