Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re: Does DBI handle table alias?

by punch_card_don (Curate)
on Oct 06, 2004 at 18:16 UTC ( #397122=note: print w/replies, xml ) Need Help??


in reply to Does DBI handle table alias?

Thanks, all, for the help & direction.

I'm trying to put it into practice. Here's the situation:

I have ~20,000 users, each reporting an average of 600 characteristics out of a possible 3,000 characteristics. The objective of this system is to count users who correspond to various collections of characteristics, in the general form:

count all users who reported characteristics a, b &c

I've created a two-column databse that looks like this:

characteristic_id (int), user_id(int)

with no indexes to start, as a baseline. So, for the average user, there are 600 rows, that makes 12-million rows in total.

Following your suggestions, I connected by telnet to execute a few queries.

select count(*) from theTable
correctly reports 12,000,000 rows in 0.00s

select count(*) from theTable where characteristic_id = 123
reports in 0.02s

At this point I'm thinking this is really great. But then

select count(*) from theTable where user_id = 456
reports in 10.77s

And then applying VSarkiss' excellent style:

select count(*) from theTable A inner join theTable B on A.user_id = B.user_id where A.characteristic_id = 123 and B.characteristic_id = 124
reports in 3min 23s!

So - in effect, nothing to do with Perl.

But a new problem. It takes several dozen such queries to compile the report data I need. Clearly 3-minutes per query is unacceptable.

Personally, I'm amazed by the magnitude of the increase in response time.

Sooooooo - dare I hope to see a two-orders-of-magnitude decrease in response time by adding indexes? Or do you think my database design is fubar?

Replies are listed 'Best First'.
Re^2: Does DBI handle table alias?
by The_Rabbit (Acolyte) on Oct 06, 2004 at 19:18 UTC
    You should investigate what kind query analysis tools your DBMS has. There should be a tool that will tell you what type of execution path your DBMS is planning for your query. Generally, these will help you determine where indexes are useful.

    If you have absolutely no indexes then for every query you are doing a full table scan instead of an index scan. So, adding a n index will improve performance dramatically. Most likely by at LEAST an order of magnitude and possibly even more than that.

      You should investigate what kind query analysis tools your DBMS has.

      In particular, check your RDBMS docs for "EXPLAIN PLAN". It's how you get the database to cough up the query execution plan, which will tell you things like where the database is having to do full table scans to satisfy your query. This'll give you suggestions about where indexes might help.

Re^2: Does DBI handle table alias?
by dragonchild (Archbishop) on Oct 06, 2004 at 18:23 UTC
    If you add indices, you will see response times drop from 3min23s to under 10s, and probably closer to 1s. I have seen it happen. In other words, try it and see!

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://397122]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (3)
As of 2022-08-12 06:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?