The stupid question is the question not asked PerlMonks

### Re^2: using Statistics::Regression

by Random_Walk (Prior)
on Apr 19, 2017 at 15:31 UTC ( #1188286=note )

in reply to Re: using Statistics::Regression

And there I was thinking of the pain I would save myself using a ready cut module ;)

I must say your two lines of code do rather beat the existing docco. Off to give it a go, thanks.

Cheers,
R.

Pereant, qui ante nos nostra dixerunt!

Replies are listed 'Best First'.
Re^3: using Statistics::Regression
by Anonymous Monk on Apr 19, 2017 at 17:05 UTC
Ah, I've got it. Try this instead:
```
my $reg = Statistics::Regression->new("Pain", ["C", "X", "X**2", "X**3"]);
$reg->include($y, [1.0, $x, $x**2, $x**3]);
```

Thank you so much Anonymonk, now I am getting somewhere. The fragment of code I am now using, culled from a larger script, goes like this ...

```
# OK, now let's use linear regression to fit
my $reg = Statistics::Regression->new( $data->{Name}, [ "Const", "Theta1", "Theta2" ] );

for ( @{$data->{values}} ) {
    # some time conversion goes on here to make times into Epoch ...
    my $epoch = mktime($s, $m, $h, $D, $M-1, $Y);   # mktime is assumed to be imported from POSIX
    my $x = $_->[2];
    # echo each call so the inputs can be checked against the results
    print "\$reg->include ( $epoch, [1, $x, ". $x**2 ." ] )\n";
    $reg->include ( $epoch, [1, $x, $x**2 ] );
}
print "Results are ...\n";
$reg->print();
```

Here is some output. Mostly it is working, but on one set of data it chokes...

```
# This is printed by the lines above, this one works fine...

$reg->include ( 1491858157, [1, 95.24, 9070.6576 ] )
$reg->include ( 1491944593, [1, 95.24, 9070.6576 ] )
$reg->include ( 1492030986, [1, 95.22, 9066.8484 ] )
$reg->include ( 1492117236, [1, 95.23, 9068.7529 ] )
$reg->include ( 1492203637, [1, 95.23, 9068.7529 ] )
$reg->include ( 1492290038, [1, 95.23, 9068.7529 ] )
$reg->include ( 1492376435, [1, 95.23, 9068.7529 ] )
$reg->include ( 1492462840, [1, 95.23, 9068.7529 ] )
$reg->include ( 1492549241, [1, 95.23, 9068.7529 ] )
$reg->include ( 1492621259, [1, 95.24, 9070.6576 ] )
Results are ...
****************************************************************
Regression '3116.dpepicqt.SYSAUX'
****************************************************************
Name                          Theta                 StdErr    T-stat
[0='const']    -22405806112434.5120    16790271247707.7460     -1.33
[1='Theta1']      470587750270.0963      352619498612.0823      1.33
[2='Theta2']       -2470766737.1275        1851377334.2664     -1.33

R^2= 0.206, N= 10, K= 3
****************************************************************

# This one chokes ...

$reg->include ( 1491858157, [1, 93.6, 8760.96 ] )
$reg->include ( 1491944593, [1, 93.6, 8760.96 ] )
$reg->include ( 1492030986, [1, 93.6, 8760.96 ] )
$reg->include ( 1492117236, [1, 93.6, 8760.96 ] )
$reg->include ( 1492203637, [1, 93.6, 8760.96 ] )
$reg->include ( 1492290038, [1, 93.6, 8760.96 ] )
$reg->include ( 1492376435, [1, 93.6, 8760.96 ] )
$reg->include ( 1492462840, [1, 93.6, 8760.96 ] )
$reg->include ( 1492549241, [1, 93.6, 8760.96 ] )
$reg->include ( 1492621259, [1, 93.64, 8768.4496 ] )
Results are ...
****************************************************************
Regression '3116.dpepicqt.SYSTEM'
****************************************************************
Report.pl::Statistics::Regression:standarderrors: I cannot compute the theta-covariance matrix for variable 3 0
 at C:/Perl64/site/lib/Statistics/Regression.pm line 619.
    Statistics::Regression::standarderrors(Statistics::Regression=HASH(0x44dfe90)) called at C:/Perl64/site/lib/Statistics/Regression.pm line 430
    Statistics::Regression::print(Statistics::Regression=HASH(0x44dfe90)) called at Report.pl line 125
    main::predict(HASH(0x4340ec8), 10) called at Report.pl line 85
```

I am guessing I may not have enough variation in that data for it to find an optimum, but if anyone can see I am barking up the wrong tree, please do shout.

Cheers,
R.

Pereant, qui ante nos nostra dixerunt!

### Update

I have now tried it with a cubic term, and it failed on an earlier data set. Then I tried it with just the constant and an X term, no square or higher, and it ran the complete set. So now I can get a best-fit line. Next step is to see if I can feed it some guess values for the theta vector.
You need at least 3 distinct values of x to produce a quadratic fit. Similarly, you need at least 4 for a cubic, and at least 2 for a linear fit.
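
That rule can be checked before calling `include`. With only 93.6 and 93.64 present in the SYSTEM data, the x**2 column is an exact linear combination of the constant and x columns (x**2 = 187.24*x - 8764.704 at both values), so three coefficients cannot be pinned down. Below is a minimal sketch, assuming repeated x values are exact repeats as in the data posted above; `max_poly_degree` is a hypothetical helper, not part of Statistics::Regression.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Regression): returns the
# highest polynomial degree the data can support, which is one less than
# the number of distinct x values.
sub max_poly_degree {
    my @x = @_;
    my %seen;
    $seen{$_}++ for @x;    # hash keys collapse exact repeats
    return scalar( keys %seen ) - 1;
}

# x values copied from the two data sets shown in this thread
my @sysaux = ( 95.24, 95.24, 95.22, 95.23, 95.23, 95.23, 95.23, 95.23, 95.23, 95.24 );
my @system = ( (93.6) x 9, 93.64 );

for my $set ( [ SYSAUX => \@sysaux ], [ SYSTEM => \@system ] ) {
    my ( $name, $x ) = @$set;
    printf "%s: highest usable polynomial degree is %d\n",
        $name, max_poly_degree(@$x);
}
```

On the posted data this reports degree 2 for SYSAUX and degree 1 for SYSTEM, which matches the quadratic fit working on the first and choking on the second. Values that differ only by floating-point noise would count as distinct here, so computed data may need rounding first.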
