Ranking position in a SQL index

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Assuming a table with millions of rows containing an indexed numeric column with a wide range of values that change very frequently, what would be the most efficient way to determine what position or "rank" of any given row has based on the value in the indexed column?

Thanks for your replies.

Comment on Ranking position in a SQL index

Replies are listed 'Best First'.

Re: Ranking position in a SQL index
by jZed (Prior) on Oct 05, 2004 at 04:28 UTC

  SELECT COUNT($col)
    FROM $table
   WHERE $col <= $this_row_col_value
[download]

Or perhaps I misunderstand what you mean by "rank", if so, please explain.

[reply]
[d/l]

Re: Ranking position in a SQL index
by Zaxo (Archbishop) on Oct 05, 2004 at 03:43 UTC

It depends on what your dbms supports, but look at the ORDER BY and LIMIT clauses.

After Compline,
Zaxo

[reply]

Re^2: Ranking position in a SQL index

by Anonymous Monk on Oct 05, 2004 at 03:55 UTC

I know those clauses but I can't see how they would help me get the numeric rank of a given row or value efficiently.

[reply]

A reply falls below the community's threshold of quality. You may see it by logging in.

Re: Ranking position in a SQL index
by sgifford (Prior) on Oct 05, 2004 at 04:41 UTC

SELECT COUNT(*) FROM table WHERE numeric_column < 
  (SELECT numeric_column FROM table WHERE your_criteria)
[download]

In my version of mysql, subqueries aren't supported, so I actually have to save the subselect results into a variable, then do the second select, but EXPLAIN says this similar query is reasonably well-optimized:

mysql> explain select count(*) FROM faar_homes WHERE price < 80000 \G
*************************** 1. row ***************************
        table: faar_homes
         type: range
possible_keys: price
          key: price
      key_len: 5
          ref: NULL
         rows: 25
        Extra: where used; Using index
[download]

[reply]
[d/l]
[select]

Re^2: Ranking position in a SQL index

by pg (Canon) on Oct 05, 2004 at 04:53 UTC

"SELECT COUNT(*) FROM table WHERE numeric_column < 
  (SELECT numeric_column FROM table WHERE your_criteria)"
[download]

I believe you knew, but just to make it complete. The criteria has to make sure that only one row is returned from the subquery. Otherwise you would get error like 'too many rows returned'.

[reply]
[d/l]

Re: Ranking position in a SQL index
by superfrink (Curate) on Oct 05, 2004 at 04:02 UTC

[reply]

Re^2: Ranking position in a SQL index

by pg (Canon) on Oct 05, 2004 at 04:13 UTC

Having is kind of like where, but only works with group by. It allows you to pick up groups that meets the criterias defined in having.

But having does not give you order, nor does group by. Although most of the DBMS systems actually do sorting for group by, it is not required by SQL standard, so unless it is clearly stated in document, don't assume it does the ordering.

For example, oracle does not always sort group by, but sometime it sorts. You will see this, when you do explain. Oracle will tell you whether it is 'sorted group by' or just a 'group by'. But this is not controled by you, so in Oracle, you still have to order by after group by, to ensure the ordering is really there. This does not degrade the performance, as Oracle will skip the order by silently, if the group by is 'sorted'.

[reply]

Re^2: Ranking position in a SQL index

by Anonymous Monk on Oct 05, 2004 at 04:07 UTC

Well I think in the worst case I could confine the problem to an O(log N) binary search via the LIMIT clause, but I'm hoping for an O(1) solution.

[reply]

Re: Ranking position in a SQL index
by Plankton (Vicar) on Oct 05, 2004 at 05:09 UTC

select what, ever, rownum RANK from something where rownum between 123
+ and 321;
[download]

Plankton: 1% Evil, 99% Hot Gas.

[reply]
[d/l]

Re^2: Ranking position in a SQL index

by jplindstrom (Monsignor) on Oct 05, 2004 at 11:23 UTC

Tom has a good explanation of what it does. It could be useful in the original poster's case though.

[reply]

Re: Ranking position in a SQL index
by TedPride (Priest) on Oct 05, 2004 at 10:10 UTC

SELECT idn FROM tablename WHERE index = 123 LIMIT 1;

Don't forget the LIMIT part. It makes the search more efficient. Note also that idn should be a primary key auto-increment, so it returns the row number assuming you don't delete any rows (better to flag rows for delete and then replace with fresh data later on).

On the other hand, if you're just doing this from a text file, you can set up a secondary file with all the index values and line numbers in order by index, and then perform a binary search to find the line number you want. Then the line number can be used to access the actual data in the other file.

[reply]
[d/l]

Re^2: Ranking position in a SQL index

by demerphq (Chancellor) on Oct 05, 2004 at 17:29 UTC

Just to clarify something here. The 'limit' clause has no effect on the efficiency of the search. It affects the overall efficiency of the query because it prevents you from accidentally requesting more rows than you can handle.

In the case where there is a unique index on the field (which is essentially what a PK-identity is anyway) the limit is entirely superlfuous and probably actually slows things down just marginally as its more to be marshalled accross the interprocess or intermachine barrier. Also to the best of my knowledge 'limit' is not standard SQL but rather a MySQL extension.

Anyway. ;-)

---
demerphq

_{First they ignore you, then they laugh at you, then they fight you, then you win.

-- Gandhi

Flux8}

[reply]

Re^3: Ranking position in a SQL index

by dragonchild (Archbishop) on Oct 05, 2004 at 17:36 UTC

Oracle does have something similar, but it's harder to do.

Being right, does not endow the right to be rude; politeness costs nothing.
Being unknowing, is not the same as being stupid.
Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

[reply]

Re^4: Ranking position in a SQL index (vairants of limit:sybase)

by demerphq (Chancellor) on Oct 05, 2004 at 17:42 UTC

Re^5: Ranking position in a SQL index (vairants of limit:sybase)

by mpeppler (Vicar) on Oct 05, 2004 at 19:35 UTC

Re: Ranking position in a SQL index
by Anonymous Monk on Oct 06, 2004 at 20:00 UTC

Here is a way to do it. SQL is a lot like Perl in the TMTOWTDI. The code joins the table on itself which is usually not a very good performer when you have millions of rows. The nice thing about this example is that it's SQL ANSI 92 standard so it should run on all compliant dBs and it gives you the result you want.

CREATE TABLE Abc(letter char(1), position int)
INSERT INTO Abc (letter, position) VALUES ('a', 1)
INSERT INTO Abc (letter, position) VALUES ('b', 2)
INSERT INTO Abc (letter, position) VALUES ('d', 4)
INSERT INTO Abc (letter, position) VALUES ('j', 10)
INSERT INTO Abc (letter, position) VALUES ('z', 26)

SELECT COUNT(*) AS rank, a1.letter
FROM Abc a1
JOIN Abc a2 ON a1.letter > a2.letter
GROUP BY a1.letter
[download]

[reply]
[d/l]

Re^2: Ranking position in a SQL index

by monktim (Friar) on Oct 06, 2004 at 20:14 UTC

JOIN Abc a2 ON a1.letter >= a2.letter
[download]

[reply]
[d/l]

Back to Seekers of Perl Wisdom