in reply to Speeding up point-in-polygon -- take two

For example, instead of storing the polygons as they're defined, break the polygons into horizontal strips with `P[0]`=lower-left, `P[1]`=upper-left, `P[2]`=upper-right, and `P[3]`=lower-right. In the case that the top or bottom edge is a point, you'll have the same thing, where `P[1]==P[2]` or `P[3]==P[0]`, respectively. Then you'll insert these simpler polygons (quadrilaterals, nominally) into the database, so you can use a simpler test. Thus, we break this polygon:

*--------------------* \ A / * *-----* / / \ \ / / \ \ / *--------* \ / *

into these three:

*--------------------* \ A / *--*-----*-------* / \ \ A(3)/ / A(2) \ \ / *--------* \ / *

Since we have quadrilaterals (and triangles, sort-of), we can simplify your subroutine into something resembling:

    sub _pointIsInQuadrilateral {
        my ($a_point, $a_x, $a_y) = @_;

        # point coords
        my ($x, $y) = ($a_point->[0], $a_point->[1]);

        # poly coords, indexed 0..3 to match P[0]..P[3] above
        my @x = @$a_x;    # x-coordinates
        my @y = @$a_y;    # y-coordinates

        # Assumes $y already lies within the strip (e.g. from the
        # bounding-box select), so only the left and right edges are tested.
        if ($x < ($x[1]-$x[0]) * ($y-$y[0]) / ($y[1]-$y[0]) + $x[0]) {
            # $x is left of the leftmost edge, so it's not in polygon
            return 0;
        }
        if ($x > ($x[3]-$x[2]) * ($y-$y[2]) / ($y[3]-$y[2]) + $x[2]) {
            # $x is right of the rightmost edge, so it's not in polygon
            return 0;
        }

        # Point is inside quadrilateral!
        return 1;
    }

As you can see, by redefining the problem and using quadrilaterals with parallel top and bottom edges, we get the following benefits:

`$n`)

So we're simplifying the code by increasing our polygon count. Even with the increased number of shapes in your table, I'm guessing that the select statement will probably run at about the same speed. Yes, there'll be more polygons, but they'll have smaller bounding boxes, so there'll be fewer false hits. I also think (without testing!) that `_pointIsInQuadrilateral` should be considerably faster.

This is only one way you could redefine the problem. After studying it, you might find a different way to redefine the problem to speed things up. If you try this approach, please let me know how it turns out!
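If it helps, here's a rough, untuned sketch of how the strip-splitting step might look. It assumes a simple (non-self-intersecting) polygon given as a list of `[x, y]` pairs with y increasing upward; the name `polygon_to_strips` and that input format are made up for illustration, not taken from your code. It cuts at every distinct vertex y, collects the edges spanning each band, sorts them left to right, and pairs them off into quads in the `P[0]`..`P[3]` order described above.

```perl
use strict;
use warnings;

# Split a simple polygon into horizontal strips.
# $poly: reference to a list of [x, y] vertices in order (hypothetical format).
# Returns a list of quads, each a list of four [x, y] points ordered
# lower-left, upper-left, upper-right, lower-right.
sub polygon_to_strips {
    my ($poly) = @_;
    my $n = @$poly;

    # Distinct vertex y values, ascending: these are the strip boundaries.
    my %seen;
    my @ys = sort { $a <=> $b } grep { !$seen{$_}++ } map { $_->[1] } @$poly;

    my @strips;
    for my $i (0 .. $#ys - 1) {
        my ($y_lo, $y_hi) = @ys[$i, $i + 1];

        # Collect every edge that spans this band, with its x at both boundaries.
        my @cuts;
        for my $j (0 .. $n - 1) {
            my ($x1, $y1) = @{ $poly->[$j] };
            my ($x2, $y2) = @{ $poly->[($j + 1) % $n] };
            next if $y1 == $y2;                     # horizontal edges span no band
            ($x1, $y1, $x2, $y2) = ($x2, $y2, $x1, $y1) if $y1 > $y2;
            next if $y1 > $y_lo || $y2 < $y_hi;     # edge does not cover the band
            my $slope = ($x2 - $x1) / ($y2 - $y1);  # change in x per unit of y
            push @cuts, [ $x1 + $slope * ($y_lo - $y1),
                          $x1 + $slope * ($y_hi - $y1) ];
        }

        # Order the edges left to right (by x at mid-height), then pair them:
        # edges 0 and 1 bound one strip, edges 2 and 3 the next, and so on.
        @cuts = sort { ($a->[0] + $a->[1]) <=> ($b->[0] + $b->[1]) } @cuts;
        while (@cuts >= 2) {
            my ($l, $r) = splice(@cuts, 0, 2);
            push @strips, [ [ $l->[0], $y_lo ], [ $l->[1], $y_hi ],
                            [ $r->[1], $y_hi ], [ $r->[0], $y_lo ] ];
        }
    }
    return @strips;
}

# Example: a wide top edge with a notch cut into the bottom.
my @strips = polygon_to_strips([ [0,4], [8,4], [6,0], [4,2], [2,0] ]);
print join(' ', map { "($_->[0],$_->[1])" } @$_), "\n" for @strips;
```

The example polygon comes out as two triangles below the notch (each with `P[3]==P[0]`) and one trapezoid above it. A non-simple polygon, or one with holes, would need more care than this sketch gives it.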

**Note:** One other advantage is that with a fixed number of points per polygon, you can store the point coordinates in the table directly as numbers, rather than having a list of coordinates in text. Omitting the parsing of text to numbers may provide another speedup.
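To make the note concrete, one possible layout is a row of twelve plain numbers per strip: the bounding box followed by the four corners. This is only a sketch; the table name `strips`, the column names, and the helper `quad_to_row` are all hypothetical, and the SQL string is just built here, not run against any database.

```perl
use strict;
use warnings;

# Hypothetical column layout: bounding box first, then P[0]..P[3].
my @COLUMNS = qw(min_x min_y max_x max_y x0 y0 x1 y1 x2 y2 x3 y3);

# Flatten one quad (four [x, y] points) into a list of numbers
# in the same order as @COLUMNS.
sub quad_to_row {
    my ($quad) = @_;
    my @xs = sort { $a <=> $b } map { $_->[0] } @$quad;
    my @ys = sort { $a <=> $b } map { $_->[1] } @$quad;
    return ($xs[0], $ys[0], $xs[-1], $ys[-1], map { @$_ } @$quad);
}

# An INSERT with one "?" placeholder per column, suitable for a
# prepare/execute style database interface.
my $insert_sql = sprintf 'INSERT INTO strips (%s) VALUES (%s)',
    join(', ', @COLUMNS), join(', ', ('?') x @COLUMNS);

my @row = quad_to_row([ [1,2], [0,4], [8,4], [7,2] ]);
print "$insert_sql\n@row\n";
```

With the coordinates stored this way, the rows that come back from the bounding-box select can be handed to the point test as numbers, with no text splitting in between.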

--roboticus