PerlMonks  

creating and printing a sliding window

by Angharad (Pilgrim)
on Mar 04, 2009 at 14:40 UTC ( [id://748175]=perlquestion )

Angharad has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that looks like this.
1 0 0.00 0 0 0
2 0 0.00 0 0 0
3 0 0.08 0 0 0
4 0 0.05 0 0 0
5 0 0.08 0 0 0
6 0 0.05 0 0.12 0
7 0 0.05 0 0.12 0
8 0 0.04 0 0.15 0
9 0.07 0.07 0 0.15 0.18
10 0.29 0.04 0.32 0.32 0.19
11 0.46 0.05 0.42 0.30 0.21
12 0.45 0.07 0.35 0.29 0.41
13 0.57 0.07 0.42 0.00 0.47
14 0.46 0.04 0.62 0.00 0.58
15 0.39 0.05 0.41 0.00 0.37
and so on, where the first column is a position number and the other columns are my data of interest. I want to create a 'sliding window' that takes all the data points within 5 positions into account and then prints the highest score in each of my 5 data columns within those 5 positions. For example, for the data:
10 0.29 0.04 0.32 0.32 0.19
11 0.46 0.05 0.42 0.30 0.21
12 0.45 0.07 0.35 0.29 0.41
13 0.57 0.07 0.42 0.00 0.47
14 0.46 0.04 0.62 0.00 0.58
I would print out
10-14 0.57 0.07 0.62 0.32 0.58
This is what I've attempted so far, which simply doesn't work: I can't get the count to increment. Do I need to use an array instead of a while loop that goes through the file one line at a time?
#!/usr/bin/perl -w
use strict;
use warnings;
use English;
use FileHandle;
use Exception;

my $input = shift;
my $count = 0;
my $largest_cons1 = 0;
my $largest_cons2 = 0;
my $largest_cons3 = 0;
my $largest_cons4 = 0;
my $largest_cons5 = 0;

open(FILE, "$input") || die "ERROR: Unable to open input file: $!\n";
while(<FILE>) {
    my @data = split(/\s+/, $_);
    $count++;
    my $pos = $data[1];
    my $cons1 = $data[2];
    my $cons2 = $data[3];
    my $cons3 = $data[4];
    my $cons4 = $data[5];
    my $cons5 = $data[6];
    if($count < 5) {
        if($cons1 > $largest_cons1) { $largest_cons1 = $cons1; }
        if($cons2 > $largest_cons2) { $largest_cons2 = $cons2; }
        if($cons3 > $largest_cons3) { $largest_cons3 = $cons3; }
        if($cons4 > $largest_cons4) { $largest_cons4 = $cons4; }
        if($cons5 > $largest_cons5) { $largest_cons5 = $cons5; }
    }
    print "$pos $largest_cons1 $largest_cons2 $largest_cons3 $largest_cons4 $largest_cons5\n";
}
Any help/suggestions much appreciated. Thanks in advance!

Replies are listed 'Best First'.
Re: creating and printing a sliding window
by johngg (Canon) on Mar 04, 2009 at 15:23 UTC

    Have a look at the List::Util core module, particularly the max() routine. Also be aware that array subscripts are zero-based, so your

    ...
    my $pos = $data[1];
    my $cons1 = $data[2];
    my $cons2 = $data[3];
    my $cons3 = $data[4];
    my $cons4 = $data[5];
    my $cons5 = $data[6];
    ...

    will be pointing one element too far to the right. You can also do that in one fell swoop.

    while( <FILE> ) {
        my( $pos, $cons1, $cons2, $cons3, $cons4, $cons5 ) = split;
        ...

    The default action for split is to split $_ on whitespace.
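A minimal sketch of both suggestions together, using a hard-coded sample line rather than the real file. The `parse_line` helper is an illustrative name, not something from the original post:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: split one record into its position and its scores.
sub parse_line {
    my ($line) = @_;
    # split ' ' splits on runs of whitespace and ignores leading whitespace,
    # which is what a bare split does to $_.
    my ($pos, @cons) = split ' ', $line;
    return ($pos, @cons);
}

# Element 0 is the position; the scores start at element 1 of the split list.
my ($pos, @cons) = parse_line("10 0.29 0.04 0.32 0.32 0.19\n");
print "position $pos, highest score ", max(@cons), "\n";
```

Collecting the scores into an array also means max() can be applied to them directly, with no chain of numbered variables.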

    I hope this is helpful.

    Cheers,

    JohnGG

Re: creating and printing a sliding window
by shmem (Chancellor) on Mar 04, 2009 at 15:30 UTC

    Close.

    if($count < 5) {
        ...
    }
    else {
        print "$pos ... \n";
        $count = 0;
        @data = ();
    }

    but you would need to reset $largest_cons<n> too, depending on your requirements.

    Some points:

    • you use English, FileHandle and Exception, but then you don't make use of them. Why?
    • Why those $largest_cons<number> variables? Wherever you are inclined to number your variables, you really want an array
    • at open(FILE, "$input") use three-argument open
    • at open(FILE, "$input") - useless use of quotes

    You could also use $. (see perlvar) in a flip-flop (".." - see perlop):

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $input = shift;
    #open FILE, '<', "$input" or die "ERROR: Unable to open input file: $!\n";
    my @largest_cons = (0) x 6;   # inhibit "uninitialized" warnings
    while (<DATA>) {
        my @data = split;
        # if ( 1 .. 5 )   # see update below
        # {
        $largest_cons[0] = $data[0] if $. == 1;
        for (1..$#data) {
            $largest_cons[$_] = $data[$_] if $data[$_] > $largest_cons[$_];
        }
        if ($. == 5) {
            $largest_cons[0] .= '-' . $data[0];
            print "@largest_cons\n";
            $. = 0;
            @largest_cons = (0) x 6;
        }
        # }
    }

    Note that starting with 1, you end at 1-5 .. 11-15 rather than 10-14. For that you need a row 0.

    Update: on a second look at the code I've posted, the flip-flop-business doesn't make sense here... ;)
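The code above reports consecutive, non-overlapping blocks of five rows rather than a window that advances one row at a time. The same block-wise idea can be sketched with an explicit counter instead of resetting `$.`; the `block_maxima` helper and its row format (array refs of position followed by scores) are assumptions made for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: per-column maxima over consecutive, non-overlapping
# blocks of $size rows. Each row is an array ref: [position, score, ...].
sub block_maxima {
    my ($size, @rows) = @_;
    my (@out, @largest, $first);
    my $seen = 0;
    for my $row (@rows) {
        my ($pos, @scores) = @$row;
        $first = $pos if $seen == 0;    # remember where this block starts
        for my $i (0 .. $#scores) {
            $largest[$i] = $scores[$i]
                if !defined $largest[$i] || $scores[$i] > $largest[$i];
        }
        if (++$seen == $size) {
            push @out, join(' ', join('-', $first, $pos), @largest);
            $seen    = 0;               # start the next block afresh
            @largest = ();
        }
    }
    return @out;    # a trailing partial block is dropped
}

my @rows = ([1, 3, 9], [2, 7, 1], [3, 2, 2], [4, 5, 5], [5, 1, 8], [6, 9, 9]);
print "$_\n" for block_maxima(5, @rows);
```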

Re: creating and printing a sliding window
by repellent (Priest) on Mar 04, 2009 at 18:48 UTC
    I would stick with the while loop as it is more scalable and make use of List::Util.
    use warnings;
    use strict;
    use List::Util qw(max);

    my @window;
    while (<DATA>) {
        chomp();

        # create sliding window
        push(@window, [ (split) ]);
        shift(@window) if $. > 5;

        # print range
        print $window[0][0], "-", $window[-1][0];

        # print maximums
        for my $i (1 .. 5) {
            print " ", max(map { $_->[$i] } @window);
        }
        print "\n";
    }

    __END__
    1 0 0.00 0 0 0
    2 0 0.00 0 0 0
    3 0 0.08 0 0 0
    4 0 0.05 0 0 0
    5 0 0.08 0 0 0
    6 0 0.05 0 0.12 0
    7 0 0.05 0 0.12 0
    8 0 0.04 0 0.15 0
    9 0.07 0.07 0 0.15 0.18
    10 0.29 0.04 0.32 0.32 0.19
    11 0.46 0.05 0.42 0.30 0.21
    12 0.45 0.07 0.35 0.29 0.41
    13 0.57 0.07 0.42 0.00 0.47
    14 0.46 0.04 0.62 0.00 0.58
    15 0.39 0.05 0.41 0.00 0.37

    Output:
    1-1 0 0.00 0 0 0
    1-2 0 0.00 0 0 0
    1-3 0 0.08 0 0 0
    1-4 0 0.08 0 0 0
    1-5 0 0.08 0 0 0
    2-6 0 0.08 0 0.12 0
    3-7 0 0.08 0 0.12 0
    4-8 0 0.08 0 0.15 0
    5-9 0.07 0.08 0 0.15 0.18
    6-10 0.29 0.07 0.32 0.32 0.19
    7-11 0.46 0.07 0.42 0.32 0.21
    8-12 0.46 0.07 0.42 0.32 0.41
    9-13 0.57 0.07 0.42 0.32 0.47
    10-14 0.57 0.07 0.62 0.32 0.58
    11-15 0.57 0.07 0.62 0.30 0.58
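The first four output lines (1-1 through 1-4) cover fewer than five rows. If only complete windows are wanted, as in the 10-14 example from the question, one possible variation is to hold back output until the window is full. This is a sketch under that assumption; `full_window_maxima` is a made-up name, and it takes its lines as a list instead of reading a filehandle:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: slide a window of $size rows over @lines and return
# one summary string per complete window.
sub full_window_maxima {
    my ($size, @lines) = @_;
    my (@window, @out);
    for my $line (@lines) {
        my @fields = split ' ', $line;
        next unless @fields;                 # skip blank lines
        push @window, \@fields;
        shift @window if @window > $size;    # drop the oldest row
        next if @window < $size;             # window not yet full
        # Maximum of each score column across the rows now in the window.
        my @maxima = map {
            my $col = $_;
            max(map { $_->[$col] } @window);
        } 1 .. $#fields;
        push @out, join(' ', join('-', $window[0][0], $window[-1][0]), @maxima);
    }
    return @out;
}

my @sample = (
    "10 0.29 0.04 0.32 0.32 0.19",
    "11 0.46 0.05 0.42 0.30 0.21",
    "12 0.45 0.07 0.35 0.29 0.41",
    "13 0.57 0.07 0.42 0.00 0.47",
    "14 0.46 0.04 0.62 0.00 0.58",
);
print "$_\n" for full_window_maxima(5, @sample);
# prints: 10-14 0.57 0.07 0.62 0.32 0.58
```

Only one window's worth of rows is held in memory at a time, so the approach still works line by line on a large file.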
Re: creating and printing a sliding window
by Utilitarian (Vicar) on Mar 04, 2009 at 15:19 UTC
    Slurping the file into an array would be more efficient.

    hints below

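    ...
    open FILE, "<$ARGV[0]";
    @data_records = <FILE>;
    # stop once fewer than 5 rows remain, so no window runs off the end
    for ($x = 0; $x + 5 <= @data_records; $x++) {
        @max = ();    # start each window afresh
        for ($index = $x; $index < ($x + 5); $index++) {
            @record = split(/\s+/, $data_records[$index]);
            # separate column counter, so the row counter $index is not clobbered
            for ($col = 1; $col < @record; $col++) {
                $max[$col] = $record[$col]
                    if !defined $max[$col] || $max[$col] < $record[$col];
            }
            $first = $record[0] if $index == $x;
            $last  = $record[0];
        }
        # label each window with its first and last position, not the file name
        print "$first-$last @max[1 .. $#max]\n";
    }
    EDIT: re-read the question; now trying to answer what was actually asked.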
