From: | "Billy G(dot) Allie" <Bill(dot)Allie(at)mug(dot)org> |
---|---|
To: | Patrick Nelson <pnelson(at)neatech(dot)com> |
Cc: | "PostgreSQL List (E-mail)" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Proper Join and check |
Date: | 2002-10-07 06:01:13 |
Message-ID: | 200210070601.g9761DI08907@bajor.mug.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Patrick Nelson wrote:
> Billy G. Allie wrote:
> ----------------->>>>
> The query you are looking for is:
>
> select a.sym from tableone a
> where a.sym not in (select b.sym from tablemany b
> where b.sym = a.sym);
>
> This query will run MUCH faster if you create a secondary index for
> tablemany (in fact, don't even try it without the index for any significant
> number or rows):
[. . .]
> Oh yeah it helped... Thanks the query took like 4 seconds. I'm not sure I
> totally understand the b.sym = a.sym though.
With the b.sym = a.sym, the query optimizer will use an index scan of tablemany.
Without it, a sequential scan will be used (very VERY S L O W for any signifi-
cant number of rows).
Without the b.sym = a.sym, the result of the subselect will be all the rows in
tablemany which will have to be scanned to see if the test (not in) succeeds.
With the b.sym = a.sym, the result of the subselect will be empty or the value
of b.sym (as determined by an index lookup, which is why the index was needed).
I hope this clarifies things somewhat.
___________________________________________________________________________
____ | Billy G. Allie | Domain....: Bill(dot)Allie(at)mug(dot)org
| /| | 7436 Hartwell | MSN.......: B_G_Allie(at)email(dot)msn(dot)com
|-/-|----- | Dearborn, MI 48126|
|/ |LLIE | (313) 582-1540 |
From | Date | Subject | |
---|---|---|---|
Next Message | Shridhar Daithankar | 2002-10-07 07:15:57 | Re: understanding insert slowdown |
Previous Message | Tom Lane | 2002-10-07 03:20:33 | cross-posts (was Re: Large databases, performance) |