efficient way of searching though large number of text file in a given directory

Angharad has asked for the wisdom of the Perl Monks concerning the following question:

I am about to write a script that takes a particular 'identifier' - just a piece of text really taken from the command line and then I want to go though a number of files within a particular directory and open them up one at a time.
Each of these files would then searched for the one that contains this 'identifier' and then print out the name of that file to the screen.
I'm aware, however, that opening a large number of files once at a time and then searching though them a line at a time just to hunt out a piece of text might be a slow and memory hungry method.
Can anyone think of a sensible approach to writing this script? Any suggestions much appreciated

Comment on efficient way of searching though large number of text file in a given directory

Replies are listed 'Best First'.
Re: efficient way of searching though large number of text file in a given directory by moritz (Cardinal) on Dec 01, 2010 at 16:13 UTC
I'd try an existing tool like grep, which is probably faster than Perl. `$ grep -l YourString directory/*` [download] Perl 6 - second systems done right	[reply] [d/l]
Re^2: efficient way of searching though large number of text file in a given directory by QM (Parson) on Dec 02, 2010 at 17:03 UTC
Agreed. Note that egrep is faster than grep. -QM -- Quantum Mechanics: The dreams stuff is made of	[reply]
Re: efficient way of searching though large number of text file in a given directory by eff_i_g (Curate) on Dec 01, 2010 at 17:28 UTC
Use `fgrep` (fast grep) if you only need to search for a string (not a pattern).	[reply] [d/l]
Re: efficient way of searching though large number of text file in a given directory by Anonymous Monk on Dec 01, 2010 at 16:32 UTC
You can also use `grep`, using e.g. the `qx//` feature, and read that output as the way of identifying those files that you might need to explore further. You can do a lot of useful things in a Unix-shell environment by “piping” together these very useful commands: `grep xargs awk perl (of course)` I put `perl` in that list, not for humor’s sake, but to point out that Perl doesn’t have to be front-and-center in whatever approach you take. You can get a lot of very useful work done very fast by “stitching together” existing tools, sometimes eliminating the need to write a single program to do it.	[reply]
Re: efficient way of searching though large number of text file in a given directory by Anonymous Monk on Dec 01, 2010 at 16:39 UTC
ack/App::Ack or grep	[reply]
Re: efficient way of searching though large number of text file in a given directory by pemungkah (Priest) on Dec 02, 2010 at 01:02 UTC
And adding a combination of caching and Linux::Inotify2 would let you get a nice additional speed gain (plus let you know when to invalidate the cache), assuming repeated searches for the same string are going to be happening. You might even be able to quickly re-prime the cache if the underlying files change infrequently by hanging onto the N most-commonly-repeated searches, redoing these, and then reloading them into the cache in the background.	[reply]


Perl-Sensitive Sunglasses
	PerlMonks