http://qs321.pair.com?node_id=1224779


in reply to Re^2: searching polygons not merged
in thread searching polygons not merged

To illustrate polygon (a standard PostgreSQL data type), polygon comparison (here: overlap), and polygon indexing (with which my practical experience is pretty much zero; caveat emptor!), I lifted some SQL from the standard regression tests in the postgres source tree (src/test/regress/sql/polygon.sql), messed about with it a bit, and added comments:

Note: the postgres polygon overlap operator is &&

#!/bin/bash
echo "
drop table if exists quad_poly_tbl ;
create table quad_poly_tbl (id int, p polygon);

insert into quad_poly_tbl
  select (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
  from generate_series(1, 100) x, generate_series(1, 100) y ;

insert into quad_poly_tbl
  select i, polygon '((200, 300),(210, 310),(230, 290))'
  from generate_series(10001, 11000) AS i ;

analyze quad_poly_tbl;

-- search for overlap with this polygon:
select * from quad_poly_tbl where p && '((22,640),(23.0717967697245,644),(26,646.928203230275),(30,648),(34,646.928203230275),(36.9282032302755,644),(38,640))'::polygon ;
--> Time: 1.382 ms  -- Seq Scan on quad_poly_tbl (3 MB)

-- now add index:
create index quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);

-- search again for overlap with this polygon but now WITH the polygon-index present:
select * from quad_poly_tbl where p && '((22,640),(23.0717967697245,644),(26,646.928203230275),(30,648),(34,646.928203230275),(36.9282032302755,644),(38,640))'::polygon ;
--> Time: 0.271 ms  -- Bitmap Index Scan on quad_poly_tbl_idx (1 MB)
" | psql -qa

So the difference between seqscan and polygon-index in this test (searching for 6 matching rows in a table of 11000 rows) is roughly a factor of five:

--> Time: 1.382 ms  -- Seq Scan on quad_poly_tbl (3 MB)
--> Time: 0.271 ms  -- Bitmap Index Scan on quad_poly_tbl_idx (1 MB)

For the OP's question, of course, the cost of loading the data into the database, etc., should also be taken into account.

Re^4: searching polygons not merged
by LanX (Saint) on Oct 28, 2018 at 01:08 UTC

      You link to the documentation of an old postgres version (9.2) that is not supported anymore (I know that Google often puts these old documentation links at the top, which is annoying).

      But in this case there is an important distinction: recent postgres ships built-in SP-GiST operator classes. So the 'limitations' that apply when implementing your own SP-GiST class, while still true, need not worry us here. You will see that newer docs have a page "Built-in Operator Classes". My example uses the built-in SP-GiST class. The 'limitations' are just warnings for those who implement their own SP-GiST class.

      It would be interesting to have an example dataset to compare the performance of any forthcoming pure-perl solution with what postgres indexing can do. Or graphic libraries (like you mentioned earlier).

        > It would be interesting to have an example dataset to compare the performance of any forthcoming pure-perl solution with what postgres indexing can do. Or graphic libraries (like you mentioned earlier).

        I'm trying not to cross that road. ;)

        If you want you can compare it against a trivial bounding-box approach of spatial indexing.

        You add 4 indexed columns for each polygon and use the formula from my first post to eliminate candidates°.
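The formula from that first post is not repeated in this part of the thread, so the following is only a sketch that assumes it is the usual interval test on the four columns (xmin, ymin, xmax, ymax). The names `bounding_box` and `boxes_overlap` are made up for illustration; the two polygons are taken from the SQL test above.

```python
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(points: List[Point]) -> BBox:
    """Smallest axis-aligned rectangle that contains all vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: BBox, b: BBox) -> bool:
    """True if the two rectangles share at least one point (touching counts).

    Assumed form of the elimination formula: two boxes can only overlap
    if their x-intervals AND their y-intervals both intersect.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Polygons copied from the SQL test earlier in the thread.
triangle = [(200, 300), (210, 310), (230, 290)]
query = [(22, 640), (23.0717967697245, 644), (26, 646.928203230275), (30, 648),
         (34, 646.928203230275), (36.9282032302755, 644), (38, 640)]

print(bounding_box(triangle))                                      # (200, 290, 230, 310)
print(boxes_overlap(bounding_box(triangle), bounding_box(query)))  # False
```

A `False` here means the pair can be discarded without looking at the actual vertices; a `True` only means the pair remains a candidate.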

        A simplistic pure Perl solution would probably imply sorted arrays, slices and hash intersections for the AND clause.

        (Doesn't scale well but can handle 10000 polygons easily)
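As a rough sketch of the sorted-arrays-and-intersections idea (in Python rather than Perl, purely for illustration): each of the four box columns gets its own sorted array, a binary search picks the qualifying slice, and set intersections play the role of the AND clause. The names `build_index` and `candidates` are hypothetical. The boxes are rebuilt from the SQL test data under the assumption that the polygon made from each circle reaches the circle's extremes on both axes (the query polygon in the test suggests this).

```python
from bisect import bisect_left, bisect_right

# Bounding boxes (xmin, ymin, xmax, ymax) keyed by id, mirroring the test table.
boxes = {}
for x in range(1, 101):
    for y in range(1, 101):
        r = 1 + (x + y) % 10                      # radius used in the SQL
        boxes[(x - 1) * 100 + y] = (10 * x - r, 10 * y - r, 10 * x + r, 10 * y + r)
for i in range(10001, 11001):
    boxes[i] = (200, 290, 230, 310)               # the repeated triangle


def build_index(boxes):
    """For each of the 4 columns: (sorted values, ids in the same order)."""
    index = []
    for col in range(4):
        pairs = sorted((box[col], pid) for pid, box in boxes.items())
        index.append(([v for v, _ in pairs], [pid for _, pid in pairs]))
    return index


def candidates(index, query_box):
    """Ids whose bounding box intersects query_box (not an exact overlap test)."""
    (xmin_v, xmin_id), (ymin_v, ymin_id), (xmax_v, xmax_id), (ymax_v, ymax_id) = index
    qxmin, qymin, qxmax, qymax = query_box
    hits = set(xmin_id[:bisect_right(xmin_v, qxmax)])    # xmin <= query xmax
    hits &= set(xmax_id[bisect_left(xmax_v, qxmin):])    # xmax >= query xmin
    hits &= set(ymin_id[:bisect_right(ymin_v, qymax)])   # ymin <= query ymax
    hits &= set(ymax_id[bisect_left(ymax_v, qymin):])    # ymax >= query ymin
    return sorted(hits)


index = build_index(boxes)
# Bounding box of the query polygon from the SQL test.
print(candidates(index, (22, 640, 38, 648)))   # [164, 165, 264, 265, 364, 365]
```

For this particular query the filter leaves six candidate ids out of 11000, the same count as the six rows reported for the postgres query. In general the candidates are a superset of the real overlaps.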

        You already created your test sample, I doubt the OP ever will.

        Cheers Rolf
        (addicted to the Perl Programming Language :)

        °) Nota bene: this doesn't calculate overlaps, it's just a poor man's spatial index to eliminate impossible combinations.
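To make the footnote concrete: a candidate that survives the bounding-box filter still needs an exact test. A minimal sketch for convex polygons only, using a separating-axis check, could look like the following. The function names and the example triangles are made up for illustration, and this is not a claim about how any particular database or library implements overlap.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def project(poly: List[Point], axis: Point) -> Tuple[float, float]:
    """Interval covered by the polygon when projected onto the axis."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)


def convex_overlap(a: List[Point], b: List[Point]) -> bool:
    """Separating-axis test. Valid for CONVEX polygons only; touching counts."""
    for poly in (a, b):
        n = len(poly)
        for i in range(n):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
            axis = (y1 - y2, x2 - x1)             # normal of this edge
            amin, amax = project(a, axis)
            bmin, bmax = project(b, axis)
            if amax < bmin or bmax < amin:
                return False                      # a separating line exists
    return True


big = [(0, 0), (4, 0), (0, 4)]
corner = [(4, 4), (4, 3), (3, 4)]    # bounding boxes intersect, triangles do not
inner = [(1, 1), (2, 1), (1, 2)]     # lies inside `big`

print(convex_overlap(big, corner))   # False
print(convex_overlap(big, inner))    # True
```

The `big`/`corner` pair is exactly the kind of case the bounding-box filter lets through: the boxes intersect, but the shapes themselves do not.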