Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re^2: using Statistics::Regression

by Random_Walk (Prior)
on Apr 19, 2017 at 15:31 UTC ( #1188286=note: print w/replies, xml ) Need Help??


in reply to Re: using Statistics::Regression
in thread using Statistics::Regression

And there I was thinking of the pain I would save myself using a ready cut module ;)

I must say your two lines of code does rather beat the existing docco. Of to give it a go, thanks.

Cheers,
R.

Pereant, qui ante nos nostra dixerunt!

Replies are listed 'Best First'.
Re^3: using Statistics::Regression
by Anonymous Monk on Apr 19, 2017 at 17:05 UTC
    Ah, I've got it. Try this instead:
    my $reg = Statistics::Regression->new("Pain", ["C", "X", "X**2", "X**3 +"]); $reg->include($y, [1.0, $x, $x**2, $x**3]);

      Thank you so much Anonymonk, now I am getting somewhere. The fragment of code I am now using, culled from a larger script goes like this ...

      # OK, now lets use linear regression to fit my $reg = Statistics::Regression->new( $data->{Name}, [ "Const", " +Theta1", "Theta2" ] ); # Add data points for ( @{$data->{values}} ) { # some time conversion goes on here to make times into Epo +ch ... my $epoch = mktime($s, $m, $h, $D, $M-1, $Y); my $x = $_->[2]; print "\$reg->include ( $epoch, [1, $x, ". $x**2 ." ] )\n" +; $reg->include ( $epoch, [1, $x, $x**2 ] ); } print "Results are ...\n"; $reg->print();

      Here is some output, now mostly it is working, but then on one set of data it chokes...

      # This is printed by the lines above, this one works fine... $reg->include ( 1491858157, [1, 95.24, 9070.6576 ] ) $reg->include ( 1491944593, [1, 95.24, 9070.6576 ] ) $reg->include ( 1492030986, [1, 95.22, 9066.8484 ] ) $reg->include ( 1492117236, [1, 95.23, 9068.7529 ] ) $reg->include ( 1492203637, [1, 95.23, 9068.7529 ] ) $reg->include ( 1492290038, [1, 95.23, 9068.7529 ] ) $reg->include ( 1492376435, [1, 95.23, 9068.7529 ] ) $reg->include ( 1492462840, [1, 95.23, 9068.7529 ] ) $reg->include ( 1492549241, [1, 95.23, 9068.7529 ] ) $reg->include ( 1492621259, [1, 95.24, 9070.6576 ] ) Results are ... **************************************************************** Regression '3116.dpepicqt.SYSAUX' **************************************************************** Name Theta StdErr T-stat [0='const'] -22405806112434.5120 16790271247707.7460 -1.3 +3 [1='Theta1'] 470587750270.0963 352619498612.0823 1.3 +3 [2='Theta2'] -2470766737.1275 1851377334.2664 -1.33 R^2= 0.206, N= 10, K= 3 **************************************************************** # This one chokes ... $reg->include ( 1491858157, [1, 93.6, 8760.96 ] ) $reg->include ( 1491944593, [1, 93.6, 8760.96 ] ) $reg->include ( 1492030986, [1, 93.6, 8760.96 ] ) $reg->include ( 1492117236, [1, 93.6, 8760.96 ] ) $reg->include ( 1492203637, [1, 93.6, 8760.96 ] ) $reg->include ( 1492290038, [1, 93.6, 8760.96 ] ) $reg->include ( 1492376435, [1, 93.6, 8760.96 ] ) $reg->include ( 1492462840, [1, 93.6, 8760.96 ] ) $reg->include ( 1492549241, [1, 93.6, 8760.96 ] ) $reg->include ( 1492621259, [1, 93.64, 8768.4496 ] ) Results are ... **************************************************************** Regression '3116.dpepicqt.SYSTEM' **************************************************************** Report.pl::Statistics::Regression:standarderrors: I cannot compute the + theta-covariance matrix for variable 3 0 at C:/Perl64/site/lib/Statistics/Regression.pm line 619. Statistics::Regression::standarderrors(Statistics::Regression= +HASH(0x44dfe90)) called at C:/Perl64/site/lib/Statistics/Regression.p +m line 430 Statistics::Regression::print(Statistics::Regression=HASH(0x44 +dfe90)) called at Report.pl line 125 main::predict(HASH(0x4340ec8), 10) called at Report.pl line 85

      I am guessing I may not have enough variation in that data for it to find an optimum, but if anyone can see I am barking up the wrong tree, please do shout

      Cheers,
      R.

      Pereant, qui ante nos nostra dixerunt!

      Update

      I have now tried it with a cubic term, and it failed on an earlier data set. Then I tried it with just the Constant and an X terms, no square or higher, and it ran the complete set. So now I can get a best fit line. Next step is to see if I can feed it some guess values for the theta vector.
        You need at least 3 distinct values of x to produce a quadratic fit. Similarly, you need at least 4 for a cubic, and at least 2 for a linear fit.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1188286]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (4)
As of 2021-04-11 01:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?