comment on

BTW, this test case is wrong:

    [[0, 1, 1, 2, 3],       8],
[download]

Here is my approach, which will at least identify the homogenous linear recurrences (basically everything here except the factorial example). The main work is done by candidate_recurrence. A degree-D linear recurrence is something like this:

a(n) = c(1)a(n-1) + ... + c(D)a(n-D)

where the c(i) values are constant coefficients. Any 2D-1 values uniquely define a degree-D linear recurrence, and that recurrence is what candidate_recurrence computes. It also takes another parameter which says whether to consider a recurrence with a constant additive term as well, like

a(n) = c(1)a(n-1) + ... + c(D)a(n-D) + c(0)

In either case, it's just some basic linear algebra, for which the code currently uses PDL to solve.

So given this candidate_recurrence, the sequence-identification algorithm is pretty simple. Just start with D very low and keep increasing it until you reach one that works with the entire sequence (low D = low complexity, so this is a kind of Occam's razor simplest (linear) explanation for the sequence).

This works for all of your examples except for the factorials (not a homogenous linear recurrence). For test cases (1) and (1,2), it gives reasonable answers right now (all-ones sequence, and powers of two respectively), where your test wants it to fail. And for test case (1,2,3,1), there are precision issues with PDL's linear algebra, which is unfortunate. Still, all we need is to take the inverse of an integer matrix. There are no inherent issues with such a computation in general, I just didn't want to do it by hand, and didn't realize that PDL had precision issues!

#!/usr/bin/perl

use PDL;
use List::Util;

sub candidate_recurrence {
    my ($d, $lin, @values) = @_;
    my $N = $d+$lin;
    my $samp = 2*$N -$lin; ## number of samples we'll need
    
    ## some base cases
    return if $N == 0;
    return Recurrence->new( 0, 1, $values[0] ) if $d == 0 and $lin == 
+1;
    return if @values < $samp; # need enough points
    
    if ($d == 1 and $lin == 0) {
      return $values[0] ? Recurrence->new( 1, 0, $values[1] / $values[
+0] ) : undef;
    }

    my @samples = @values[ 0 .. $samp-1 ];

    my @matrix;
    for (0 .. $N-1) {
        push @matrix, [ @samples[ $_ .. $_+$d-1 ] ];
    }
    if ($lin) {
      push @$_, 1 for @matrix;
    }

    my $M = pdl @matrix;
    my $v = pdl( [ @samples[ $d .. $samp-1 ] ] )->transpose;
    
    return if $M->det == 0;
    return Recurrence->new( $d, $lin, list( $M->inv() x $v ) );
}

sub check_recurrence {
    my $R = shift;
    for my $n ($R->d .. $#_-1) {
        return if $R->next( @_[ 0 .. $n ] ) != $_[$n+1];
    }
    return 1;
}

sub identify {
    for my $d (0 .. 2) {
        for my $lin (0 .. 1) {
            my $R = candidate_recurrence($d, $lin, @_);
            return $R if $R and check_recurrence($R, @_);
        }
    }
}


sub series {
    my $R = identify(@_) or return;
    return $R->next(@_);
}

############
## cute recurrence class to make things simpler

{
 package Recurrence;
 sub new {
    my ($pkg, $d, $lin, @coeff) = @_;
    my $const = $lin ? pop @coeff : 0;
    return bless { d=>$d, lin=>$lin, coeff=>\@coeff, const=>$const }, 
+$pkg;
 }
  
 sub next {
    my $self = shift;
    return if @_ < $self->{d};
    my @window =  @_[ -$self->{d} .. -1 ];
    
    $self->{const}
      + List::Util::sum map { $self->{coeff}[$_] * $window[$_] } 0 .. 
+$#window;    
 }
  
 sub print {
    my $self = shift;
    my $d = $self->d;
    my @terms = grep { !/^0\*/ }
                map { $self->{coeff}[$d-$_] . "*a(n-$_)" } 1 .. $self-
+>{d};
    push @terms, $self->{const} if $self->{const} || @terms == 0;
    my $str = "a(n) = " . join " + ", @terms;
    $str =~ s/\b 1\*//gx;
    return $str; }
  
 sub d { return $_[0]->{d}; }
}
[download]

As for extending this to work with more general sequences, it seems like it would be quite difficult to get something that works with too many more classes of sequences. Fortunately linear recurrences encompass many common use cases. When your "magic star" recognizes 1, 1, 2, 5, 14, 42, 132 as the first few Catalan numbers, and 2, 6, 20, 70, 252 as the first few central binomial coefficients, you'll be onto something special ;)

Update: Replacing your main loop with the following

for my $t (@tests) {
    my @list = @{$t->[0]};
    print "@list : ";
    my $R   = identify(@list);
    if ($R) {
        print "identified as ", $R->print, " : next = ", $R->next(@lis
+t), $/;
    } else {
        print "??\n";
    }
}
[download]

.. gives some more verbose output:

1 : identified as a(n) = 1 : next = 1
1 1 : identified as a(n) = 1 : next = 1
0 0 : identified as a(n) = 0 : next = 0
1 2 : identified as a(n) = 2*a(n-1) : next = 4
0 1 2 : identified as a(n) = a(n-1) + 1 : next = 3
1 0 -1 : identified as a(n) = a(n-1) + -1 : next = -2
1 2 3 : identified as a(n) = a(n-1) + 1 : next = 4
1 2 4 : identified as a(n) = 2*a(n-1) : next = 8
2 4 8 : identified as a(n) = 2*a(n-1) : next = 16
1 3 9 : identified as a(n) = 3*a(n-1) : next = 27
1 -1 1 -1 : identified as a(n) = -a(n-1) : next = 1
-1 1 -1 1 : identified as a(n) = -a(n-1) : next = -1
1 0 1 0 : identified as a(n) = -a(n-1) + 1 : next = 1
0 1 0 1 : identified as a(n) = -a(n-1) + 1 : next = 0
1 1 2 3 5 : identified as a(n) = a(n-1) + a(n-2) : next = 8
0 1 1 2 3 : identified as a(n) = a(n-1) + a(n-2) : next = 5
1 2 3 5 8 : identified as a(n) = a(n-1) + a(n-2) : next = 13
1 2 6 24 120 : ??
1 0 0 1 : ??
1 2 3 1 : identified as a(n) = 5*a(n-1) + -7*a(n-2) : next = -16
1 3 5 : identified as a(n) = a(n-1) + 2 : next = 7
2 4 6 : identified as a(n) = a(n-1) + 2 : next = 8
[download]

blokhead

In reply to Re: Challenge: Simple algorithm for continuing series of integers by blokhead
in thread Challenge: Simple algorithm for continuing series of integers by moritz

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


Problems? Is your data what you think it is?
	PerlMonks