http://qs321.pair.com?node_id=887307


in reply to Re: search a large text file
in thread search a large text file

I suspect that KinoSearch would work about as well as a database like SQLite or PostgreSQL for this. It's actually a decent conceptual match: inverted indexers like KinoSearch, Lucene, and Xapian are optimized for many reads and few inserts, whereas the B-tree indexes typical of databases handle inserts a little better. The only odd part is that the original poster doesn't seem to need the relevance-based ranking that inverted indexes do well.
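To make the "inverted index" idea concrete, here is a minimal Python sketch. It is not how KinoSearch, Lucene, or Xapian are implemented; the function names and sample documents are made up for illustration. It only shows the core structure: a map from each token to the set of documents containing it, queried with an unranked boolean AND, which is roughly all the original poster seems to need.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (no ranking)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # .get avoids creating empty entries for unknown tokens.
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical sample data, not taken from the thread.
docs = {
    1: "search a large text file",
    2: "large database index",
    3: "text search engine",
}
index = build_inverted_index(docs)
print(sorted(search(index, "large text")))
```

Building the index costs one pass over every document, and each lookup afterwards only touches the posting sets for the query tokens. That is the "many reads, fewer inserts" trade-off in miniature.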

Regardless, the problem is straightforward and there are lots of good options for solving it.

Re^3: search a large text file
by erix (Prior) on Feb 10, 2011 at 13:46 UTC

    PostgreSQL does indeed have B-tree indexes, but it also has inverted indexes (GIN) and the excellent GiST index type. (It seems to me that the B-tree type does well enough in this case; see my example below, where searching a table of more than 223 million rows takes a tenth of a millisecond.)
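As a rough illustration of a keyed lookup through a B-tree index, here is a small Python sketch. It uses the standard-library `sqlite3` module purely because it runs without a server; it is not the PostgreSQL setup referred to above, and the table name `lines`, the index name `idx_key`, and the generated data are all hypothetical. It says nothing about timings on a 223-million-row table.

```python
import sqlite3


def build_table(n_rows):
    """Create an in-memory table with a B-tree index on the lookup column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lines (key TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO lines VALUES (?, ?)",
        ((f"key{i}", f"value{i}") for i in range(n_rows)),
    )
    # SQLite's ordinary indexes are B-trees.
    conn.execute("CREATE INDEX idx_key ON lines (key)")
    conn.commit()
    return conn


def lookup(conn, key):
    """Return the value stored for key, or None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM lines WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


def query_plan(conn, key):
    """Return the planner's description of how the lookup is executed."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT value FROM lines WHERE key = ?", (key,)
    ).fetchall()
    # The last column of each row holds the human-readable detail.
    return " ".join(str(row[-1]) for row in rows)


conn = build_table(10_000)
print(lookup(conn, "key4242"))
print(query_plan(conn, "key4242"))
```

The query plan should mention `idx_key`, confirming that the lookup goes through the index rather than scanning the table. In PostgreSQL the equivalent check would be `EXPLAIN` on the same kind of query.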

    PostgreSQL index-type docs here.

    I'm just reacting to the juxtaposition of SQLite and PostgreSQL; really, SQLite, handy as it often is, cannot be compared with a powerful database system like PostgreSQL.

    (And I should really try to compare Your Mother's example with KinoSearch and see if he is right; maybe over the weekend...)