Re^2: Compare fields in a file

Alright, alright. Settle down. The script:

#!/usr/bin/perl

#-----------------------------pointset1----------------------
$points=$ARGV[0];
$cnt=0;
open(PTS,"<$points");
while(<PTS>) {
#X,Y,Z,Time,Amplitude
++$cnt;
  if(!/X/) {
    ($x,$y,$z,$ts,$a)=split ',';
    ($date,$time)=split ' ',$ts;
    $ts="$date $time";
    ($h,$m,$s)=split ':', $time;
    ($d,$mon,$yr)=split '-', $date;
    $date="$d-$mon-$yr";
    $t=($h*3600)+($m*60)+$s;#convert to seconds

    push(@ts,$t);
    push(@as,$a);
    push(@lines,$_);
  #$data[$cnt][0]=$t;
  #$data[$cnt][1]=$a;
  #$data[$cnt][2]=$_;

#print STDERR "\n$_";
#print STDERR "Event #: $cnt\n";
#print STDERR "Seconds: $t\n";
#print STDERR "Amplitude: $a\n";
    }
}
close(PTS);


while (@ts != 0) {
    $t0=pop(@ts);
    $a0=pop(@as);
    $line0=pop(@lines);


    
        # sort by amplitude smallest to largest
        #@data=sort {$a->[1]<=>$b->[1];} @data;
    #($t0,$a0,$line0) = pop(@data);
    
    #print STDERR "T0=$t0\n";
    #print STDERR "A0=$a0\n";
    #print STDERR "$line0\n";

        # sort by time difference smallest to largest
    #@data=sort {abs($t0-$data[$a][0])<=>abs($t0-$data[$b][0])} @data;

   $flag=0;
    for ($i=0;$i<@ts;$i++){
            
            $test = abs($t0-$ts[$i]);

            #print "NEXT Time= $ts[$i]\n";
            #print "NEXT Amp= $as[$i]\n";
            #print "Time Difference= $test\n";


            #if ($test < 0.3 && $a0 < $as[$i]){print "TIME: $test is less than 0.3\n";
            #print "AMPLITUDE:$a0 is less than $as[$i]\n"}

            if( $test < 0.3 && $a0 < $as[$i]){$flag=1}
                #print "FLAG= $flag\n";
                  }
        if ($flag==0) {unshift(@keepers,$line0)};
        #push(@keepers,$line0);
}

open(OUT,">out_file.txt");
print OUT "X,Y,Z,Time,Amplitude\n";    
print OUT (@keepers);
close (OUT);

The input file:

X,Y,Z,Time,Amplitude
2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:08.151,1
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2

Current results:

X,Y,Z,Time,Amplitude
2550,531,66,10-12-2007 07:03:08.069,2
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2

The current script only works if the largest amplitude appears first in time (the default sort order of the data). An attempt to get a bit more sophisticated (the commented-out lines) wasn't working either. -honyok
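The script above seems to keep the `07:03:09.069` line because each popped event is compared only against the events still left in `@ts`, that is, the ones earlier in the file; a larger event that comes later in time has already been popped. A minimal sketch of one way around that, assuming the rule is "drop an event if any other event within 0.3 s has a strictly larger amplitude" and ignoring the date as the original does. The helper name `keep_local_maxima` is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $window = 0.3;    # seconds, the threshold used in the original script

# Hypothetical helper: takes "X,Y,Z,dd-mm-yyyy hh:mm:ss.fff,Amplitude" lines
# and returns the lines that survive the window test, in input order.
# Assumption: the date is ignored, so all events must fall on the same day.
sub keep_local_maxima {
    my @events;
    for my $line (@_) {
        next if $line =~ /^X,/;    # skip the header line
        chomp( my $copy = $line );
        my ( $stamp, $amp ) = ( split /,/, $copy )[ 3, 4 ];
        my $time = ( split ' ', $stamp )[1];
        my ( $h, $m, $s ) = split /:/, $time;
        push @events,
            { line => $copy, t => $h * 3600 + $m * 60 + $s, amp => $amp };
    }

    my @keepers;
    EVENT: for my $e (@events) {
        # Compare against ALL events, earlier and later, not only the
        # ones that have not been processed yet.
        for my $other (@events) {
            next EVENT
                if abs( $e->{t} - $other->{t} ) < $window
                && $other->{amp} > $e->{amp};
        }
        push @keepers, $e->{line};
    }
    return @keepers;
}

my @input = (
    "X,Y,Z,Time,Amplitude\n",
    "2550,531,66,10-12-2007 07:03:08.069,2\n",
    "2549,529,62,10-12-2007 07:03:08.151,1\n",
    "2550,531,66,10-12-2007 07:03:09.069,1\n",
    "2549,529,62,10-12-2007 07:03:09.151,2\n",
);
print "$_\n" for keep_local_maxima(@input);
```

On the four sample events this keeps the `07:03:08.069` and `07:03:09.151` lines. The nested loop is quadratic in the number of events, which is probably fine for small files but would need rethinking for large ones.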

Re^3: Compare fields in a file by johngg (Canon) on Feb 10, 2009 at 18:55 UTC
Can it be safely assumed that the data file will already be in time order? If so, you can process the lines second by second, accumulating the lines until the to-the-second resolution time changes and then processing the accumulated lines to find the one with the largest amplitude. You do not say what you want to do when more than one line has the maximum amplitude.

use strict;
use warnings;

# Skip headings line(s).
my $discard = <DATA> for 1 .. 1;

my $currentTimeStr = q{};
my @currentLines   = ();

while( <DATA> )
{
    my $timeStr = ( split m{,} )[ 3 ];
    $timeStr =~ s{\..*}{};
    if( $timeStr ne $currentTimeStr )
    {
        processLines( @currentLines ) if @currentLines;
        $currentTimeStr = $timeStr;
        @currentLines   = ( $_ );
    }
    else
    {
        push @currentLines, $_;
    }
}

processLines( @currentLines );

sub processLines
{
    my @sortedLines =
        map  { $_->[ 0 ] }
        sort { $b->[ 1 ] <=> $a->[ 1 ] }
        map  { [ $_, ( split m{,|\n} )[ -1 ] ] }
        @_;
    print $sortedLines[ 0 ];
}

__END__
X,Y,Z,Time,Amplitude
2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:08.151,1
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2

The output:

2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:09.151,2

I hope this is useful.

Cheers,
JohnGG
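The question of ties is left open above. A minimal sketch of one possible answer, assuming every line that shares the maximum amplitude within a second should be kept. The helper names `amplitude` and `max_amplitude_lines` and the sample lines are made up for illustration:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: the amplitude is the last comma-separated field.
sub amplitude {
    my ($line) = @_;
    my @fields = split /,/, $line;
    return $fields[-1];
}

# Hypothetical helper: given the lines accumulated for one second, return
# every line whose amplitude equals the maximum, so ties are all kept.
sub max_amplitude_lines {
    my @lines = @_;
    chomp @lines;
    return unless @lines;
    my $max = max( map { amplitude($_) } @lines );
    return grep { amplitude($_) == $max } @lines;
}

# Made-up data: two events tie at amplitude 2 within the same second.
my @second = (
    "2550,531,66,10-12-2007 07:03:08.069,2\n",
    "2549,529,62,10-12-2007 07:03:08.151,2\n",
    "2548,530,64,10-12-2007 07:03:08.500,1\n",
);
print "$_\n" for max_amplitude_lines(@second);
```

Something like this could stand in for the sort inside `processLines` if keeping ties is what is wanted. If only one line per second is wanted, taking the first element of the returned list would give the earliest of the tied events, assuming the input is in time order.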
Re^4: Compare fields in a file by Not_a_Number (Prior) on Feb 10, 2009 at 19:19 UTC
If I add a few events to the input, this breaks. Try it with these data:

2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:08.151,1
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2
2550,531,66,10-12-2007 07:03:10.001,6
2550,531,66,10-12-2007 07:03:11.099,7

The output:

2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:09.151,2
2550,531,66,10-12-2007 07:03:10.001,6
2550,531,66,10-12-2007 07:03:11.099,7

Ignore. I misread the specs.
Re^5: Compare fields in a file by johngg (Canon) on Feb 10, 2009 at 19:57 UTC
Perhaps I've misunderstood the requirement but that output looks as I would have expected.

Two events at time `07:03:08`, event with amplitude 2 chosen.
Two events at time `07:03:09`, event with amplitude 2 chosen.
A single event at time `07:03:10`, the only event (with amplitude 6) chosen.
A single event at time `07:03:11`, the only event (with amplitude 7) chosen.

So far as I can tell, that satisfies the requirement "I'd like to keep only the largest magnitude within each second" to the letter. Perhaps you could explain further in which way it is "broken."

Cheers,
JohnGG
Re^6: Compare fields in a file by Not_a_Number (Prior) on Feb 10, 2009 at 21:41 UTC
Re^3: Compare fields in a file by toolic (Bishop) on Feb 10, 2009 at 18:23 UTC
Does this do what you want? It stuffs all the lines into a hash-of-hashes data structure. The primary key is the time, truncated to seconds. The secondary key is the magnitude. First, it sorts by time, then by magnitude, keeping only the largest magnitude.

Update: This code needs to be adapted if the input file contains data for more than one day.

use strict;
use warnings;

my %mags;
while (<DATA>) {
    next if /X/;
    chomp;
    my $pair = (split)[-1];
    my ($time, $mag) = split /,/, $pair;
    $time =~ s/\..*//;
    $mags{$time}{$mag} = $_;
}

for my $time (sort keys %mags) {
    my $mag = (sort {$b <=> $a} keys %{ $mags{$time} })[0];
    print "$mags{$time}{$mag}\n";
}

__DATA__
X,Y,Z,Time,Amplitude
2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:08.151,1
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2

This prints:

2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:09.151,2
Re^4: Compare fields in a file by CountZero (Bishop) on Feb 10, 2009 at 20:29 UTC
Nice, but it does not take into account the date portion of the date-time. If you have several days of data it will lump together all the same times, not looking at the day. I grant you that it is not very clear from the OP's example whether his file can contain multiple days.

I looked again at the program written by the OP and indeed he only uses the time element, so you were right!

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Re^5: Compare fields in a file by toolic (Bishop) on Feb 10, 2009 at 20:43 UTC
I couldn't decide whether it was relevant to mention this "feature" of my code. So, in my Laziness, I decided to omit that assumption. I think I'll update my node to be more explicit.
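For files that do span more than one day, a minimal sketch of a date-aware variant. It assumes the timestamp is `dd-mm-yyyy hh:mm:ss.fff` with zero-padded fields (the day-month-year order is taken from the OP's own `split`), and the helper name `largest_per_second` and the second date in the sample are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: keep the largest-amplitude line per calendar second.
# The hash key is rebuilt as "yyyy-mm-dd hh:mm:ss" so that a plain string
# sort is also chronological (this relies on zero-padded fields).
sub largest_per_second {
    my %best;    # key => [ amplitude, line ]
    for my $line (@_) {
        next if $line =~ /^X,/;    # skip the header line
        chomp( my $copy = $line );
        my ( $stamp, $mag )  = ( split /,/, $copy )[ 3, 4 ];
        my ( $date, $time )  = split ' ', $stamp;
        my ( $d, $mon, $yr ) = split /-/, $date;
        $time =~ s/\..*//;         # truncate to whole seconds
        my $key = "$yr-$mon-$d $time";
        # Strict ">" means the first line seen wins a tie.
        $best{$key} = [ $mag, $copy ]
            if !exists $best{$key} || $mag > $best{$key}[0];
    }
    return map { $best{$_}[1] } sort keys %best;
}

# The 11-12-2007 line is invented to show that equal clock times on
# different days are no longer lumped together.
my @lines = (
    "X,Y,Z,Time,Amplitude\n",
    "2550,531,66,10-12-2007 07:03:08.069,2\n",
    "2549,529,62,11-12-2007 07:03:08.151,5\n",
    "2549,529,62,10-12-2007 07:03:08.151,1\n",
);
print "$_\n" for largest_per_second(@lines);
```

Unlike the hash-of-hashes version, this keeps only one candidate per second as it reads, so two lines with the same amplitude in the same second do not overwrite each other silently: the earlier one is kept.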
Re^3: Compare fields in a file by Not_a_Number (Prior) on Feb 10, 2009 at 19:26 UTC
Try this:

use strict;
use warnings;

my %biggest;

while ( <DATA> ) {
    chomp;
    my @items = split /,/;
    my $coords = join ',', @items[ 0 .. 2 ];
    my ( $time, $mag ) = @items[ 3, 4 ];
    if ( not defined $biggest{$coords} or $mag > $biggest{$coords}->[1] ) {
        $biggest{$coords} = [ $time, $mag ];
    }
}

for my $coords ( keys %biggest ) {
    print join( ',', $coords, join',', @{ $biggest{$coords}} ), "\n";
}

__DATA__
2550,531,66,10-12-2007 07:03:08.069,2
2549,529,62,10-12-2007 07:03:08.151,1
2550,531,66,10-12-2007 07:03:09.069,1
2549,529,62,10-12-2007 07:03:09.151,2
2550,531,66,10-12-2007 07:03:10.001,6
2550,531,66,10-12-2007 07:03:11.099,7

Output:

2550,531,66,10-12-2007 07:03:11.099,7
2549,529,62,10-12-2007 07:03:09.151,2
Re^4: Compare fields in a file by honyok (Sexton) on Feb 10, 2009 at 19:44 UTC
Thanks all. I'll try the suggestions and update later. - honyok

