As I commented over in Building Regex Alternations Dynamically, it's possible to generate a regex from the entirety of /usr/share/dict/words, which on my system currently has over 100,000 entries; the resulting regex is about 1MB long as a string, and matching against it is still relatively performant. So building a regex the way you showed is possible. Whether it's the best solution in your case probably depends on how many matches you'll be doing with that regex, so you'll have to measure the performance in your own use case. I would recommend that loadCommonWords return a regex precompiled with qr// instead of a string, and that you sort @commonwords by length, as I showed in the aforementioned thread.
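Here is a minimal sketch of that recommendation, not your actual loadCommonWords: the sub name build_word_regex and the sample words are made up for illustration. It assumes that "sort by length" means longest first, so that a longer word is tried before any word that is a prefix of it, and it uses quotemeta so that any regex metacharacters in the words are matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper: build one precompiled regex from a list of words.
sub build_word_regex {
    my @words = @_;
    # Longest first, so "foobar" is tried before its prefix "foo";
    # ties are broken alphabetically to keep the result stable.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) or $a cmp $b } @words;
    # qr// compiles the pattern once; the caller can reuse the object.
    return qr/(?:$alternation)/;
}

my $re    = build_word_regex(qw(foo foobar bar));
my @found = "xx foobar yy bar" =~ /($re)/g;
print "@found\n";    # prints "foobar bar"
```

Returning the qr// object means callers can interpolate it into larger patterns, for example with anchors or word boundaries, without the alternation being rebuilt from a string on every call.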
Update: Eily is right, I overlooked the anchors: for exact string matches, definitely use a hash instead.
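For the exact-match case, the hash approach could look roughly like this. The word list and the name %is_common are placeholders of mine; in your code the list would come from wherever loadCommonWords gets it.

```perl
use strict;
use warnings;

# Placeholder word list, standing in for the real common words.
my @commonwords = qw(the and of);

# Build the lookup table once; each word becomes a key.
my %is_common = map { $_ => 1 } @commonwords;

for my $word (qw(the perl)) {
    # An exact string match is a single hash lookup, no regex needed.
    my $status = $is_common{$word} ? "common" : "not common";
    print "$word: $status\n";
}
```

Note that the lookup is case-sensitive as written; if that matters, lowercase both the keys and the word being tested with lc.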