Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

Re^4: Efficient search through a huge dataset

by dragonchild (Archbishop)
on Oct 20, 2004 at 02:54 UTC ( [id://400745]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Efficient search through a huge dataset
in thread Efficient search through a huge dataset

Faster and better would be outer joins
SELECT t1.blah FROM t1 LEFT OUTER JOIN t2 ON (t1.blah = t2.blah) WHERE t1.blah IS NOT NULL AND t2.blah IS NULL

It's the correct way to rewrite the NOT EXISTS (about 100x faster in my tests) and is faster than the MINUS because it doesn't have to pull both result sets into memory, then do an expensive ARRAYCOMPARE on them.

Being right, does not endow the right to be rude; politeness costs nothing.
Being unknowing, is not the same as being stupid.
Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Replies are listed 'Best First'.
Re^5: Efficient search through a huge dataset
by Corion (Patriarch) on Oct 20, 2004 at 11:39 UTC

    SQLite doesn't do the full outer join thing, so you have to fake it with two left outer joins :

    -- in table2 but not in table1: SELECT r.id FROM table2 AS r LEFT JOIN table1 AS l ON r.id = l.id WHERE r.system = ? and l.id IS NULL and r.id IS NOT NULL -- in table1 but not in table2: SELECT l.id FROM table2 AS l LEFT JOIN table1 AS r ON l.id = r.id WHERE l.system = ? and l.id IS NOT NULL and r.id IS NULL SQL

    But other than that, SQLite is up to the task, as it includes a nice bulk import statement (SQLite 2.x) and a bulk loader program (SQLite 3.x).

    Update: Added the "full" to "full outer join", as a "left (outer) join" is an outer join too, but one which SQLite does...

      Umm ... Corion? That is an outer join. SQLite just doesn't require the OUTER keyword. :-)

      Being right, does not endow the right to be rude; politeness costs nothing.
      Being unknowing, is not the same as being stupid.
      Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
      Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://400745]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (5)
As of 2024-04-20 02:26 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found