Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re: efficient method of matching a string against a list of substrings?

by ihb (Deacon)
on May 05, 2005 at 01:08 UTC ( [id://454195] : note . print w/replies, xml ) Need Help??


in reply to efficient method of matching a string against a list of substrings?

Check out Regexp::PreSuf and Regexp::Assemble. They combine lists of words or patterns into a hopefully more efficient pattern than a trivial ORed expression.

ihb

See perltoc if you don't know which perldoc to read!

  • Comment on Re: efficient method of matching a string against a list of substrings?

Replies are listed 'Best First'.
Re^2: efficient method of matching a string against a list of substrings?
by diggler (Initiate) on May 05, 2005 at 01:35 UTC
    as noted in Regexp:Assemble:
    You should realise that large numbers of alternations are processed in perl's regular expression engine in O(n) time, not O(1). If you are still having performance problems, you should look at using a trie.
    regexes are often more convenient, rather than efficient. but thanks for pointing me there, because the note also mentions:
    ... Perl's own regular expression engine will implement trie optimisations in perl 5.10 (they are already available in perl 5.9.3 if you want to try them out).
    this would probably solve the problem in the future, but i need to find something for the present.

      For the present you can/should use R::A or one of the other regex modules as they should be patched in time of 5.10 to utilize the TRIE optimisation if possible. OTOH You havent really spelled out if you mean an anchored match or not. Ie is it /^(list|of|strings)$/ or is it /^(list|of|strings)/ or is it /(list|of|strings)/? Depending on which there may be non regex based strategies that are quite competitive.

      ---
      $world=~s/war/peace/g