PerlMonks  

Re: speeding up a file-based text search

by Thelonius (Priest)
on May 06, 2003 at 18:19 UTC (#255977)


in reply to speeding up a file-based text search

You should see whether you can get a more sophisticated indexing system. I don't remember whether Glimpse speeds up within-file searches, but if it does you could use it with "agrep" (Google(TM) it).

I haven't worked with the Search::InvertedIndex module, but you could still use it, or a similar approach. You need to keep a list of all the indexed words so that you can do a fast serial scan over it (I don't know whether Search::InvertedIndex allows this) and see which of those words your pattern matches. Then you look those words up in the inverted index to get the list of actual matches. You should probably merge and sort all the matches before you retrieve them from the actual data file, so that the file is read in order rather than with scattered seeks.
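To make the approach concrete, here is a minimal sketch in Python (chosen only for brevity; the same structure maps onto a Perl hash of arrays). It does not use Search::InvertedIndex or Glimpse. The names `build_index` and `search` are made up for illustration, and it assumes the data file is line-oriented, so that a "match" is the byte offset of a line.

```python
import re
from collections import defaultdict


def build_index(path):
    """Map each lowercased word to the byte offsets of the lines containing it."""
    index = defaultdict(list)
    with open(path, "rb") as fh:
        while True:
            offset = fh.tell()          # start of the line about to be read
            line = fh.readline()
            if not line:
                break
            text = line.decode("utf-8", "replace").lower()
            # set() so a word repeated on one line is recorded only once
            for word in set(re.findall(r"\w+", text)):
                index[word].append(offset)
    return index


def search(path, index, pattern):
    """Return the lines containing any indexed word that matches `pattern`."""
    rx = re.compile(pattern)
    # 1. Serial scan over the word list (small) instead of the data file (large).
    words = [w for w in index if rx.search(w)]
    # 2. Look up each matching word, then merge, de-duplicate and sort the offsets.
    offsets = sorted({off for w in words for off in index[w]})
    # 3. Fetch the lines in ascending offset order.
    lines = []
    with open(path, "rb") as fh:
        for off in offsets:
            fh.seek(off)
            lines.append(fh.readline().decode("utf-8", "replace").rstrip("\n"))
    return lines
```

Note the limitation: the pattern is tested against individual words, so a pattern that spans more than one word will not match anything. In practice the index would also be saved to disk rather than rebuilt for every search.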
