
But I'm not clear on how we went from talking about 250 milliseconds-per-query in paragraph 2 to 250 queries-per-millisecond in paragraph 4

You're right. The post was written in two stages. Originally it was based on a few lines of code that I threw together to test the idea out: no subroutines (or their overhead), only positive match detection, and much smaller datasets. It worked, and I started writing the post on that basis. Then I realised that it was far too limited in the types of questions it could answer, and that the hard-coded scale of the test was limiting, so I went back and improved things.

The numbers in paragraph 4 are leftovers from the original, artificially simpler, but faster tests. I will update the node.

As an aside, the same technique can be applied even to datasets where the answers are not yes/no, provided the answers can be reduced to a reasonably small set of discrete values, i.e. multiple choice, as you are doing.
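To make the multiple-choice case concrete, here is a minimal sketch. It assumes the answers to one question are packed into a single string with vec(), 4 bits per respondent; that storage layout, the data, and the name respondents_with() are my illustration only, not the code from the original post.

```perl
use strict;
use warnings;

# Hypothetical data: respondent number => answer to one multiple-choice
# question, encoded as a small integer (0 = no answer, 1..5 = choices A..E).
my %answers = ( 0 => 2, 1 => 5, 2 => 2, 3 => 1, 4 => 0, 5 => 2 );

# Pack the answers into one string, 4 bits per respondent (values 0..15).
my $packed = '';
vec( $packed, $_, 4 ) = $answers{$_} for sort { $a <=> $b } keys %answers;

# Return the respondent numbers whose stored answer equals $choice.
sub respondents_with {
    my ( $vector, $count, $choice ) = @_;
    return grep { vec( $vector, $_, 4 ) == $choice } 0 .. $count - 1;
}

my @chose_b = respondents_with( $packed, scalar keys %answers, 2 );
print "Chose B: @chose_b\n";
```

With 4 bits per answer, six respondents fit in three bytes. vec() only accepts element widths that are powers of two, so a question with more than 15 choices would need 8 bits per respondent.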

All too often you see applications storing things like dates, ages and other continuously variable numeric values when all the application really requires is "Under 18 | over 18 and under 65 | over 65" or similar, which can easily be replaced by an enumeration. Many DBs can handle these quite efficiently.
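A rough sketch of that reduction in Perl, for the age bands quoted above. The constant names, the age_band() helper and the sample ages are hypothetical, and I have assumed that exactly 18 and exactly 65 fall into the upper band, which the wording above leaves open.

```perl
use strict;
use warnings;

# Hypothetical enumeration for the three age bands.
use constant { UNDER_18 => 0, AGE_18_TO_64 => 1, AGE_65_PLUS => 2 };
my @band_label = ( 'Under 18', '18 to 64', '65 and over' );

# Reduce a continuous age to one of the three enumerated values.
# Assumption: ages of exactly 18 and exactly 65 go into the upper band.
sub age_band {
    my ($age) = @_;
    return $age < 18 ? UNDER_18
         : $age < 65 ? AGE_18_TO_64
         :             AGE_65_PLUS;
}

# Made-up sample ages, counted per band.
my @ages  = ( 12, 18, 40, 64, 65, 90 );
my @count = (0) x @band_label;
$count[ age_band($_) ]++ for @ages;

printf "%-12s %d\n", $band_label[$_], $count[$_] for 0 .. $#band_label;
```

Once the value is stored as 0, 1 or 2, it needs only two bits per record, and a query by age group becomes an equality test instead of a range comparison.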

Unfortunately, they also tend to apply arbitrary limits to the size of an enumeration: 32, 64, etc. The shame is that, in many cases, these coincide with the limits on the number of indexes that may be applied to a given table (MySQL: 32; DB2: (was) 255). In many cases, large enumeration types could substitute for large numbers of indexes with considerably better efficiency. They can also remove the need for foreign keys in some cases, for another hike in efficiency.

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

In reply to Re^2: Basic Perl trumps DBI? Or my poor DB design? by BrowserUk
in thread Basic Perl trumps DBI? Or my poor DB design? by punch_card_don
