Re: speeding up a file-based text search

in reply to speeding up a file-based text search

How large would an inverted word list be? It might work better to match against join(' ', @wordlist), computed only once. Unless your data is highly random, for a 20MB document that string probably won't exceed a few dozen kilobytes, if it's even that large. On finding a match, pos, index, rindex and substr will let you extract the full word the match landed in, which you can then look up in your inverted word list.
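Here is a rough sketch of what I mean. The %index hash, its contents and the matching_words sub are made up for illustration, and it assumes your pattern always matches at least one character and never matches a space:

```perl
use strict;
use warnings;

# Hypothetical inverted word list: word => places where it occurs.
my %index = (
    search   => [ 3, 17 ],
    research => [ 8 ],
    text     => [ 1, 3 ],
    file     => [ 2 ],
);

# Build the joined string exactly once.
my $words = join ' ', sort keys %index;

# Return every indexed word that contains a match for $pattern.
# Assumption: $pattern matches at least one character and no spaces.
sub matching_words {
    my ($pattern) = @_;
    my @found;
    pos($words) = 0;                      # always start from the beginning
    while ( $words =~ /$pattern/g ) {
        my $start = $-[0];                # offset where the match began
        my $end   = pos($words);          # offset just past the match
        # Widen the match to the surrounding spaces to get the whole word.
        my $left  = rindex( $words, ' ', $start ) + 1;   # -1 + 1 == 0 for the first word
        my $right = index( $words, ' ', $end );
        $right = length $words if $right < 0;            # last word has no trailing space
        push @found, substr( $words, $left, $right - $left );
        pos($words) = $right;             # resume after this word, so it is reported once
    }
    return @found;
}

# Look up each hit in the inverted word list.
for my $word ( matching_words(qr/sear/) ) {
    print "$word: @{ $index{$word} }\n";
}
```

The regex only ever scans the short joined string, never the document itself; the document positions come straight out of the hash.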

Regex::PreSuf could help make the pattern more efficient if speed is still an issue.
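If you are looking for several terms at once, something like the following might do. The term list is made up, and Regex::PreSuf comes from CPAN, so it has to be installed first:

```perl
use strict;
use warnings;
use Regex::PreSuf;    # CPAN module, not part of core Perl

# Hypothetical search terms sharing a common prefix.
my @terms = qw(foobar fooxar foozap);

# presuf() returns a regex string with the common parts factored out,
# so the engine does not have to retry the shared prefix per alternative.
my $alternation = presuf(@terms);
my $re          = qr/$alternation/;

for my $candidate ( @terms, 'foobaz' ) {
    printf "%-8s %s\n", $candidate,
        $candidate =~ /^$re$/ ? 'matches' : 'does not match';
}
```

The compiled $re can then be used in place of a plain alternation when scanning the joined word string.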

Makeshifts last the longest.
