PerlMonks
RE: Substring Finding/Counting
by nuance (Hermit) on Aug 30, 2000 at 20:49 UTC
When you first mentioned this in the chatterbox, you
asked if anyone had any suggestions to improve it.
So here goes, FWIW.
You are writing your matches to a temporary file and then reading them back in to construct a hash. That means that if you find any substring with more than one occurrence, you end up with an entry in your data file for every one of those occurrences. Take the token you mentioned and suppose it is in every line of, let's say, a 500-record file. The first mention in the text file says 500 occurrences, the next says 499, and so on down to 2. You don't get an entry that says 1, but you will still have checked for it.
If instead of writing to that file you build the hash as you process the file, then right at the top you can just check whether the substring already exists in it. If it does, don't bother checking any further: you've already found all of its matches. For the example I gave, this equates to leaving out 124750 checks (499 + 498 + ... + 1), and that's just for one pattern.
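A minimal sketch of the idea, not your actual code: I'm assuming the records are already in an array and that you are looking at fixed-length substrings. The names count_substrings, %seen and $len are made up for illustration, and $checks is only there so you can see how many record scans get done.

```perl
use strict;
use warnings;

# Hypothetical helper: for every substring of length $len, count how many
# records contain it, skipping any substring that was already counted.
sub count_substrings {
    my ($records, $len) = @_;
    my %seen;          # substring => number of records containing it
    my $checks = 0;    # record scans performed, for comparison only

    for my $i (0 .. $#$records) {
        my $line = $records->[$i];
        for my $pos (0 .. length($line) - $len) {
            my $str = substr($line, $pos, $len);

            # Counted on an earlier pass: all its matches are already known.
            next if exists $seen{$str};

            # First sighting, so no earlier record can contain it.
            # Scan this record and the ones after it exactly once.
            my $n = 0;
            for my $j ($i .. $#$records) {
                $checks++;
                $n++ if index($records->[$j], $str) >= 0;
            }
            $seen{$str} = $n;
        }
    }
    return (\%seen, $checks);
}

# Usage: report only the substrings found in more than one record.
my @records = ('abcX', 'abcY', 'zabc');
my ($counts, $checks) = count_substrings(\@records, 3);
print "$_: $counts->{$_}\n"
    for grep { $counts->{$_} > 1 } sort keys %$counts;
```

The hash does the job the temporary file was doing, and the exists test at the top of the inner loop is what saves the repeated scans. Substrings that occur only once still get a hash entry, so filter them out when you report.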
Nuance