Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re: Re: Is there a good way of retrieving entries randomly from a database (MySQL perhaps)?

by sgifford (Prior)
on Jan 04, 2004 at 08:17 UTC ( [id://318629]=note: print w/replies, xml ) Need Help??


in reply to Re: Is there a good way of retrieving entries randomly from a database (MySQL perhaps)?
in thread Is there a good way of retrieving entries randomly from a database (MySQL perhaps)?

I used the third technique in a recent project, but instead of using the primary key (which was a non-sequential ID number), I added a rownum column which gave each row a sequential number. I then selected a random value between 1 and the number of rows.

The reason this is much faster than the ORDER BY rand() technique is that (AFAIK) in order to resolve ORDER BY rand(), MySQL has to go through each row, generate a random number, then see which one is the lowest. If the table is very large, that takes a long time.

The problem with using a row number is that if the data changes, you can end up gaps with in numbering. If the data is generated periodically, there's no problem; just regenerate the row numbers with the rest of the data. Periodically, you can regenerate the row numbers to fill in gaps. If there's just a few missing numbers scattered here and there and you don't need great randomness, it's no big deal; instead of using WHERE id = $rand you'd use WHERE id >= $rand LIMIT 1. If you have a lot of changes, though, this technique will work poorly.

Something that might work a little better with a large number of changes would be to randomly assign each record a number between, say, 1 and 100. To choose a random record, then, you would first pick a number between 1 and 100, then you would get a record randomly from the set of rows assigned that number with ORDER BY rand(). It would still have to evaluate multiple records, but on average it would only be 1/100 as many records as using ORDER BY rand() on the whole table.

  • Comment on Re: Re: Is there a good way of retrieving entries randomly from a database (MySQL perhaps)?

Replies are listed 'Best First'.
Re: Re: Re: Is there a good way of retrieving entries randomly from a database (MySQL perhaps)?
by soon_j (Scribe) on Jan 04, 2004 at 08:56 UTC
    That's a good suggestion! I believe you are right, it's a lot faster.

    Now I am left with one problem. In order to implement your suggestion, I only need to issue a single query from my questions table in the database. PLEASE SEE MY RESPONSE TO GMAX FOR AN IDEA.

    Thanks a lot! Jay
      To make this technique work with questions grouped into subjects, you'd want to assign a per-subject question number, then for each subject pick a random number between 1 and the number of questions in that subject to find the record.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://318629]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (3)
As of 2024-04-25 21:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found