Re^3: Trimming hash based on element uniqueness

by BrowserUk (Patriarch)
on Jun 18, 2008 at 07:33 UTC ( [id://692645] )


in reply to Re^2: Trimming hash based on element uniqueness
in thread Trimming hash based on element uniqueness

Okay. I see what you mean. If there are a few dozen or a few hundred select * from tableX where id = '...'; queries with only the '...' varying, you might want to reduce them to one generic select and do a local hash lookup instead of issuing a query for each.
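For instance (a minimal sketch, assuming DBI, a SQLite handle, and the hypothetical tableX/id names above), the per-id selects collapse into one generic select and a hash:

    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection; tableX and its id column are assumptions.
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=app.db', '', '',
        { RaiseError => 1 } );

    # One generic select, keyed on id, instead of one query per id.
    my $by_id = $dbh->selectall_hashref( 'select * from tableX', 'id' );

    # Each former query becomes a local hash lookup:
    my $row = $by_id->{'42'};    # was: select * from tableX where id = '42';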

Quite frankly, this is one of those situations where I'd print all the selects to a file, one per line, load them into my editor, sort them, and then step through them manually, breaking them into groups. Even with a few thousand to process, doing this manually would probably be far quicker than deriving a heuristic, coding an algorithm and testing it. And it doesn't seem like a process one would have to repeat frequently enough (or at all for any given application, if you did it right the first time) to consider automating.

I suppose if this were an application where the queries issued are derived and constructed from user input, then you might need to repeat the process a few times. Even then, doing a sort & group process manually, and then inspecting each group looking for caching opportunities and a heuristic to recognise each one, is probably far simpler than trying to come up with a totally generic mechanism.
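If you did want a rough first pass at the sort & group step, something like this would do (a sketch only; the literal-collapsing regex is a naive assumption, and the groups would still want manual inspection):

    use strict;
    use warnings;

    # Collapse quoted literals and bare numbers to '?' so that queries
    # differing only in their constants fall into the same group.
    my %groups;
    while ( my $sql = <> ) {    # one query per line, e.g. the log file
        chomp $sql;
        ( my $shape = $sql ) =~ s/'[^']*'|\b\d+\b/?/g;
        push @{ $groups{$shape} }, $sql;
    }

    # Largest groups first: the most promising caching candidates.
    for my $shape ( sort { @{ $groups{$b} } <=> @{ $groups{$a} } } keys %groups ) {
        printf "%6d  %s\n", scalar @{ $groups{$shape} }, $shape;
    }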

Were I tackling this, in addition to the one-per-line list of the queries, I'd also try to obtain/generate a "sample output" from each one. That is, a representation of the output each select produces, in terms of the table/field pairing and number of rows produced, rather than the actual values:

(table.field1, table.field3, table.field7) x 60

It might then be possible to prefix this information to the queries themselves prior to sorting, and use that to aid the grouping. I'd still derive the generic query manually the first few times; and if the caching is successful, the number of long-running queries should be reduced very quickly, which should make automating the process redundant.
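Generating that prefix could look something like this (a sketch; note that DBI's $sth->{NAME} yields only the column names, so the table part of the pairing would have to come from elsewhere):

    use strict;
    use warnings;
    use DBI;

    # Run a query once and summarise its output shape as
    # "(col1, col2, ...) x rowcount" rather than the actual values.
    sub output_shape {
        my ( $dbh, $sql ) = @_;
        my $sth = $dbh->prepare( $sql );
        $sth->execute;
        my @cols = @{ $sth->{NAME} };    # result-set column names
        my $rows = 0;
        $rows++ while $sth->fetchrow_arrayref;
        return sprintf '(%s) x %d', join( ', ', @cols ), $rows;
    }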

Probably the hardest part would be recognising, at runtime, which queries can be redirected to the cache; but manually deriving a regex or two to recognise the pattern(s) of the queries in each group replaced by a generic query would, I think, be fairly easy.
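At runtime that might reduce to something like this (a sketch, reusing the hypothetical tableX pattern and the $by_id hash from earlier; the regex is exactly the kind of manually derived pattern meant above):

    use strict;
    use warnings;

    sub lookup_or_query {
        my ( $dbh, $by_id, $sql ) = @_;

        # Manually derived pattern for one group of cacheable queries.
        if ( $sql =~ /^\s*select \* from tableX where id = '([^']+)'/i ) {
            return $by_id->{$1} if exists $by_id->{$1};
        }

        # Anything unrecognised falls through to the database.
        return $dbh->selectrow_hashref( $sql );
    }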


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
