http://qs321.pair.com?node_id=1159880


in reply to Re^6: How would you code this?
in thread How would you code this?

Most informative, particularly the image. I'd definitely chalk this up as artifacts of the control software -- station keeping is noisy. I could ask some colleagues who would likely have some pretty wonky comments about the system, but I don't think deeper physics would really impact the signal processing (deeper physics available upon request).

The problem statement as I now interpret it is generate a strictly monotonic series of X values while eliminating the smallest number of points. Agree given the presented dataset that dropping points is more appropriate than a smoothing. The controller overshot, overcompensated, and ultimately returned to the correct number. I would read this as wanting to drop the whole vee, and if graff's observation is correct, keep the top corner point and drop the lower corner point. I would implement this as

#! perl -slw use strict; use 5.10.0; printf "%s", scalar <DATA>; my @points = map[ split ' ', $_ ], <DATA>; my @modified; push @modified, shift @points; # Kickstart while ( @points ) { my $cmp = ($points[0][0] <=> $modified[-1][0])*($points[0][1] > $m +odified[-1][1]); if ($cmp == 1) { push @modified, shift @points; } elsif ($cmp == -1) { shift @points; } else { pop @modified; } }
which underperforms on the keep-the-most-points metric. Note the y-value switch so you can descend the loop with the same code. It also consistently deviates low, since it presumes that only the part after the first regression of the curve is wrong. Similar results are obtained with
while ( @points ) { if (($points[0][0] > $modified[-1][0])^($points[0][1] < $modified[ +-1][1])) { push @modified, shift @points; } else { shift @points; pop @modified; } }
which drops a slightly larger number of points, and I think has a slightly cleaner curve because of it. My best idea is that we are trying to correct for S's in the data. For minimal impact to the curve, we probably want to hold onto the start of the S, the end of the S, and the median of the retrograde component. Note that by incorporating the median, we may be adding a fictitious point.
while ( @points ) { if (($points[0][0]-$modified[-1][0])/($points[0][1]-$modified[-1][ +1]) > 0) { push @modified, shift @points; } else { # Retrograde! my @retro = (pop @modified, shift @points); push @retro, shift @points while ($retro[-1][0]-$points[0][0]) +/($retro[-1][1]-$points[0][1]) < 0; shift @points while ($retro[0][0]-$points[0][0])/($retro[0][1] +-$points[0][1]) < 0; pop @modified while ($retro[-1][0]-$modified[-1][0])/($retro[- +1][1]-$modified[-1][1]) < 0; push @modified, [0.5*($retro[@retro/2-.5][0] + $retro[@retro/2 +][0]), 0.5*($retro[@retro/2-.5][1] + $retro[@retro/2 +][1]), ] } }
With this, I finally beat graff's point count, though one might argue his result is a little better on the wiggly bits since I bias high.

I realize that the above implementations are written in a very Perly way, but they can certainly be changed to more traditional structures after a best algorithm has been suggested.

Finally, I should note that given the characteristic uncertainties of these measurements, this whole discussion is academic since none of these curves are significantly different from each other from a measurement perspective.


#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Replies are listed 'Best First'.
Re^8: How would you code this?
by BrowserUk (Patriarch) on Apr 08, 2016 at 19:24 UTC

    Kenneth, thanks for your continued explorations. I've been looking at this most of the day, and I think I finally made a decision.

    Looking in detail at the differences between the original data (red), my original algorithm (blue); graff's agorithm (black) and your latest algorithm (green); there is one place where I think your algorithm is clearly superior: the bottom left corner of the top right zoom, where your algorithm splits the difference between mine and graff's; and I think is clearly a better reflection of the underlying trend of the original data.

    However, as you pointed out, it does 'invent' points, which neither of the other two do. I also agree that in most other cases yours has a tendency to bias high.

    So after much umming and aahing, I'm going with graffs algorithm. Thanks.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.