to calculate mean and variance

by cdfd123 (Initiate)
on Jan 11, 2008 at 02:29 UTC ( [id://661777]=perlquestion )

cdfd123 has asked for the wisdom of the Perl Monks concerning the following question:

Suppose you have a file:
<code>
A)1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8
B)-1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5
continue....
</code>
For each line of the file I have to calculate the mean and variance. The labels A and B can be ignored; they are only there to show that each line holds a separate set of input data points, all in one file. The program is:
<code>
#!/usr/bin/perl -w
open( FILE, "< file_1" ) or die "Can't open file_1 : $!";
while( <FILE> ){
    @fields = split / /;
    for($i=0;$i < scalar(@fields); $i++ ){
        $sum[$i]+=$fields[$i];
        $sumsq[$i]+=$fields[$i]*$fields[$i];
    }
    $n++;
}
for($i=0;$i < scalar(@sum); $i++ ){
    $sum[$i] /= $n;
    $sumsq[$i] /= $n;
    $stddev = sqrt( $sumsq[$i] - $sum[$i]*$sum[$i] );
    print( $sum[$i]." ".$stddev." " );
}
close FILE
</code>
But while running the program, errors occurred:
<code>
Argument "" isn't numeric in addition (+) at meanStddev.pl line 6, <FILE> line 1.
Argument "" isn't numeric in addition (+) at meanStddev.pl line 6, <FILE> line 1.
Argument "" isn't numeric in addition (+) at meanStddev.pl
Argument "\n" isn't numeric in addition (+) at meanStddev.pl line 6, <FILE> line 3.
0.573333333333333 0.867998975933856 0 0 0.833333333333333 1.17851130197758 0 0 1.26666666666667 1.79133717900592 0 0 0.633333333333333 0.89566858950296 0 0 -1.16666666666667 1.64991582276861 0 0 -1.16666666666667 1.64991582276861 0 0 1.06666666666667 1.5084944665313 0 0 -1.3 1.83847763108502 0 0 1.4 1.97989898732233 0 0 1.5 2.12132034355964 0 0 0.933333333333333 1.31993265821489 0 0 -0.433333333333333 0.612825877028341 0 0 -0.3 0.424264068711929 0 0 -0.233333333333333 0.329983164553722 0 0 -0.133333333333333 0.188561808316413 0 0 -0.266666666666667 0.377123616632825 0 0 -1.16666666666667 1.64991582276861 0 0 -1.16666666666667 1.64991582276861 0 0 -0.533333333333333 0.754247233265651 0 0 -1.5 2.12132034355964
</code>
Any suggestions?

20080114 Janitored by Corion: Changed bold tags to code tags

Replies are listed 'Best First'.
Re: to calculate mean and variance
by davidrw (Prior) on Jan 11, 2008 at 03:01 UTC
    A couple of general suggestions:
    • use strict; At first, it'll generate a bunch of errors for you for undeclared variables, but it will help identify & reduce errors.
    • Add some debugging statements. (Data::Dumper can be quite helpful, too)
      e.g. the warnings indicate that one of the values it's trying to add to the sum is a non-numeric string .. so in the first for loop put something like print Dumper [$i, $fields[$i]] if $fields[$i] !~ /\d/;
    Hmm .. actually, it might be your split -- try split(' ') instead -- see the split() docs for full info, including that "split(/ /)" will give you as many null initial fields as there are leading spaces. So it could be that your data file has leading spaces in it somewhere.
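The difference between the two split forms can be seen in a minimal sketch. The input line below is hypothetical: it assumes the data file has a leading space and a doubled space somewhere, which is what the warnings suggest but the thread does not confirm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input line (an assumption, not taken from file_1):
# one leading space and one doubled space between values.
my $line = " 1.8 2.5  3.8\n";

# split / / breaks on every single space, so the leading space and the
# doubled space each produce an empty field, and the last field keeps
# its trailing newline.
my @by_single_space = split / /, $line;

# split ' ' skips leading whitespace and treats any run of whitespace
# (including the newline) as one separator.
my @by_whitespace = split ' ', $line;

print "split / / gives ", scalar(@by_single_space), " fields\n";
print "split ' ' gives ", scalar(@by_whitespace), " fields\n";
```

The empty fields from the first form are what trigger the "isn't numeric" warnings once they reach the addition.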

    Here's a little refactoring example, too, to demo several more "perlish" constructs:
    <code>
    #!/usr/bin/perl -w
    use strict;

    my @sum;
    my @sumsq;
    my $n = 0;
    while( <DATA> ){
      my @fields = split / /;
      foreach my $i ( 0..$#fields ){
        $sum[$i] += $fields[$i];
        $sumsq[$i] += $fields[$i]**2;
      }
      $n++;
    }
    $_ /= $n for @sum, @sumsq;
    my @stddev = map { sqrt( $sumsq[$_] - $sum[$_]**2 ) } 0 .. $#sum;
    print join(" ", @stddev) . "\n";
    #foreach my $i ( 0 .. $#sum ){
    #  printf "%5s %10s %10s\n", $i, $sum[$i], $stddev[$i];
    #}
    __DATA__
    1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8
    -1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5
    </code>
      Thanks. There may be a misconception; actually I want to calculate along the rows:
      __DATA__
      1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8 -----> calculate mean and variance
      -1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5 ---> calculate mean and variance
      Regards
        First, can you use <code></code> tags to display your data? It's hard to see exactly what you're using when it's normal text ...

        Ah -- along rows is even easier .. here's an example:
        <code>
        #!/usr/bin/perl -w
        use strict;

        my @stddev;
        while( <DATA> ){
          my @fields = split ' ', $_;
          my $N = scalar(@fields);
          my $sum = 0;
          $sum += $_ for @fields;
          my $mean = $sum / $N;
          $sum = 0;
          $sum += ($_ - $mean)**2 for @fields;
          push @stddev, sqrt($sum/$N);
        }
        print map {"$_\n"} @stddev;
        __DATA__
        1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8
        -1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5
        </code>
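The example above collects only the standard deviation, while the question asks for the mean and variance of each row. A minimal sketch of that variation follows. The helper name mean_and_variance is hypothetical, the data rows are inlined rather than read from a file, and the variance divides by N as the reply does (an assumption about what is wanted; divide by N - 1 for the sample variance).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return (mean, population variance) for a list of numbers.
# Hypothetical helper: divides by N, matching the reply above.
sub mean_and_variance {
    my @values = @_;
    my $n = @values;
    return unless $n;    # empty row: nothing to report

    my $sum = 0;
    $sum += $_ for @values;
    my $mean = $sum / $n;

    my $sum_sq_dev = 0;
    $sum_sq_dev += ( $_ - $mean )**2 for @values;

    return ( $mean, $sum_sq_dev / $n );
}

# The two data rows from the thread, inlined so the sketch is self-contained.
my @rows = (
    "1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8",
    "-1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5",
);

for my $row (@rows) {
    my ( $mean, $variance ) = mean_and_variance( split ' ', $row );
    printf "mean %.4f  variance %.4f\n", $mean, $variance;
}
```

Computing the mean first and then summing squared deviations takes two passes over each row, but it avoids the loss of precision that the sum-of-squares shortcut can suffer when the values are large relative to their spread.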
Re: to calculate mean and variance
by graff (Chancellor) on Jan 11, 2008 at 03:20 UTC
    What davidrw said about changing the split statement is bound to be the solution. I'd also point out this part of the error message:
    ... at meanStddev.pl line 6, <FILE> line 1
    The message tells you not only where in your script you had a problem (line 6 of the perl code: $sum[$i]+=$fields[$i];), but also which line of data in your input data file had just been read when the error occurred. That is, the initial space is in line 1 of the data.
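The input line number in that message is also available to the script itself through the special variable $. and can be used to locate bad fields before they reach the arithmetic. A minimal sketch, assuming hypothetical data with a leading space on line 1 and using looks_like_number from the core Scalar::Util module:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical data with a leading space on line 1 (an assumption about
# what the original file_1 looked like).
my $data = " 1.8 2.5 3.8\n-1.3 -0.9 -0.7\n";

# Read from an in-memory filehandle so the sketch needs no external file.
open my $fh, '<', \$data or die "Can't open in-memory data: $!";

my @problems;
while ( my $line = <$fh> ) {
    chomp $line;
    my @fields = split / /, $line;    # same split as the original script
    for my $i ( 0 .. $#fields ) {
        next if looks_like_number( $fields[$i] );
        # $. is the current input line number, the same number Perl
        # reports as "<FILE> line N" in its warnings.
        push @problems, "line $., field $i: '$fields[$i]'";
    }
}
close $fh;

print "$_\n" for @problems;
```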
Re: to calculate mean and variance
by CountZero (Bishop) on Jan 11, 2008 at 06:26 UTC
    Or if you do not want to reinvent a wheel: Statistics::Lite, Statistics::Basic or Statistics::Descriptive.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
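As a rough illustration of the module route, here is a sketch using Statistics::Descriptive, one of the modules named above. It is a CPAN module rather than part of core Perl, so the sketch loads it at run time and prints a notice if it is missing. The data rows are inlined from the thread, and note that this module's variance method divides by N - 1, so its numbers differ from the divide-by-N code earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Statistics::Descriptive comes from CPAN, so check for it at run time
# instead of failing at compile time when it is not installed.
my $have_module = eval { require Statistics::Descriptive; 1 };

# The two data rows from the thread, inlined so the sketch is self-contained.
my @rows = (
    "1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8",
    "-1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5",
);

if ($have_module) {
    for my $row (@rows) {
        my $stat = Statistics::Descriptive::Full->new();
        $stat->add_data( split ' ', $row );
        # variance() is the sample variance (divides by N - 1).
        printf "mean %.4f  variance %.4f\n", $stat->mean(), $stat->variance();
    }
}
else {
    print "Statistics::Descriptive is not installed\n";
}
```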

Node Type: perlquestion [id://661777]
Approved by jettero