
Re: Seeking Algorithm

by tkil (Monk)
on May 28, 2004 at 07:18 UTC ( [id://357146] )


in reply to Seeking Algorithm

Depending on how precise you want to be about equality, you might need to do a fancy floating-point compare:

    my $EPSILON = 0.01;

    sub approx_equal ( $ $ )
    {
        my ( $f1, $f2 ) = @_;
        return abs( $f1 - $f2 ) < $EPSILON;
    }
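To show why a tolerance can matter, here is a minimal sketch that compares the same values with `==` and with the tolerance check. The sample values are my own illustration, not taken from the original data, and I've dropped the prototype to keep the sketch short. On a perl built with ordinary IEEE 754 doubles, the exact comparison of `0.1 + 0.2` against `0.3` typically fails, while the tolerance check passes.

```perl
use strict;
use warnings;

my $EPSILON = 0.01;

# Same tolerance check as above, without the prototype.
sub approx_equal {
    my ( $f1, $f2 ) = @_;
    return abs( $f1 - $f2 ) < $EPSILON;
}

# Illustrative values only (assumptions, not from the original data).
my $sum = 0.1 + 0.2;

my $exact  = ( $sum == 0.3 )             ? "equal" : "not equal";
my $approx = approx_equal( $sum, 0.3 )   ? "equal" : "not equal";

print "exact compare:  $exact\n";
print "approx compare: $approx\n";

# With EPSILON at 0.01, 305.099 counts as 305.1 but 305.2 does not.
print "305.099 ~ 305.1\n" if approx_equal( 305.099, 305.1 );
print "305.2  !~ 305.1\n" unless approx_equal( 305.2, 305.1 );
```

Whether 0.01 is the right tolerance depends on how precise the data really is; it is half the step size if values are recorded to one decimal place only loosely.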

The most straightforward filter-style code I could come up with:

    while ( <> )
    {
        my @in_range =
            grep { /^[\d.+-]+$/ && 296.1 <= $_ && $_ <= 314.0 }
            split /,\s*/;

        # skip records with only 305.1 (room temp kelvin?) in range
        next if @in_range == 1 && approx_equal( $in_range[0], 305.1 );

        print;
    }

If this is too slow, it might be faster to bail out as soon as we know we can:

    LINE:
    while ( my $line = <> )
    {
        my $n_305_1    = 0;
        my $n_in_range = 0;

        foreach my $chunk ( split /,\s*/, $line )
        {
            # skip non-numeric
            next unless $chunk =~ /^[\d.+-]+$/;

            # skip values outside the range
            next unless 296.1 <= $chunk && $chunk <= 314.0;

            # only allowed one value in the given range...
            if ( ++$n_in_range >= 2 )
            {
                print $line;
                next LINE;
            }

            # and we can only ignore one 305.1
            if ( approx_equal( $chunk, 305.1 ) && ++$n_305_1 > 1 )
            {
                print $line;
                next LINE;
            }
        }

        print $line
            unless $n_in_range == 1 && $n_305_1 == 1;

        if ( $n_in_range == 0 )
        {
            warn "ill-formed line $.: no values in range";
        }
    }

Replies are listed 'Best First'.
Re: Seeking Algorithm
by Abigail-II (Bishop) on May 28, 2004 at 07:39 UTC
    There's no need for an "approx_equal" to do a fancy floating-point compare. You get the numbers from a split, so you have strings. Just use string compare and avoid the uncertainty problems you have with floating-point compare.

    Abigail
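    A minimal sketch of what the string-compare suggestion might look like, assuming every field is written exactly as "305.1" when it means 305.1. The helper name `keep_line` is hypothetical and not from the thread; the range test is still numeric, and only the equality check uses `eq`.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line should be printed.
sub keep_line {
    my ($line) = @_;
    chomp $line;

    # Numeric range test, same pattern and bounds as the filter above.
    my @in_range =
        grep { /^[\d.+-]+$/ && 296.1 <= $_ && $_ <= 314.0 }
        split /,\s*/, $line;

    # String comparison: only the literal text "305.1" is skipped,
    # so "305.10" would NOT be treated as the same value.
    return !( @in_range == 1 && $in_range[0] eq '305.1' );
}

for my $line ( "1.0, 305.1, 500.2", "1.0, 305.10, 500.2", "300.0, 305.1" ) {
    printf "%-20s %s\n", $line, keep_line($line) ? "keep" : "skip";
}
```

    The trade-off is that `eq` is exact about spelling: it avoids any floating-point doubt, but it depends on the data always being formatted the same way.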

      There's no need for an "approx_equal" to do a fancy floating-point compare. You get the numbers from a split, so you have strings. Just use string compare and avoid the uncertainty problems you have with floating-point compare.

      At one point in the evolution of my response, I was relying on strings to avoid this problem. Three reasons I switched:

      1. The original post didn't make it clear that all numbers would have exactly one digit after the decimal point. If the row contained 305.10, did that count? How about 305.099?
      2. I can never remember if scalars keep both their string and numeric natures at the same time. After using a numeric comparison against a given scalar, is it now just a number, or are both kept around? (I could find out by research or experimentation, but the fact that I had to think about it, even after 10+ years of using Perl, makes me think that I should avoid this subtlety.)
      3. Finally, it was a way to throw in an educational tidbit "for free". The original author didn't specify the example very precisely; by including this in my response, it would hopefully help them think about it more clearly. (And/or I was showing off. You decide.)
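      On the second point, a quick experiment suggests the string form survives numeric use: Perl caches a numeric value alongside the string rather than replacing it. A minimal sketch, with `$field` as an illustrative variable name:

```perl
use strict;
use warnings;

# A field as split would return it: a string.
my $field = "305.10";

# Use it in numeric context, as the range test does.
my $in_range = ( 296.1 <= $field && $field <= 314.0 ) ? 1 : 0;

# After the numeric comparison the original text is still there.
my $same_string = ( $field eq '305.10' ) ? 1 : 0;

# Numerically, "305.10" and 305.1 are the same value.
my $same_number = ( $field == 305.1 ) ? 1 : 0;

print "in range:     $in_range\n";
print "string form:  $field\n";
print "eq '305.10':  $same_string\n";
print "== 305.1:     $same_number\n";
```

      So the choice between `eq` and `==` is really a choice about whether 305.10 and 305.1 should count as the same value, not about what the scalar remembers.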
        For clarification, in my case there will never be more than two decimal places, and the data is very precise: there might be a 305.1 or a 310.10, but there would never be a 305.10 or a 310.1. While I didn't need the complexity of floating-point comparison, someone in the future might have a similar-but-different problem where this will be important. The "free tidbit" is always a plus.
