From: | Perry Smith <pedzsan(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Help with sql |
Date: | 2012-07-06 21:34:57 |
Message-ID: | 06772287-B2D0-4C9C-826C-7D49E1015978@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi Guys,
This isn't a PostgreSQL specific question but just a SQL question. If this is not an appropriate question for this list, please let me know.
It is also, perhaps, a really silly question.
This query (without the 'explain' keyword) , when executed takes forever and a day:
> condor_development=> explain select id from filesets where id not in ( select fileset_id from service_pack_fileset_maps );
> QUERY PLAN
> ----------------------------------------------------------------------------------------------
> Seq Scan on filesets (cost=0.00..71937742.00 rows=26088 width=4)
> Filter: (NOT (SubPlan 1))
> SubPlan 1
> -> Materialize (cost=0.00..2517.78 rows=95852 width=4)
> -> Seq Scan on service_pack_fileset_maps (cost=0.00..1663.52 rows=95852 width=4)
> (5 rows)
This query returns within a second:
> condor_development=> explain select id from filesets where id not in ( select distinct fileset_id from service_pack_fileset_maps );
> QUERY PLAN
> ----------------------------------------------------------------------------------------------
> Seq Scan on filesets (cost=2102.31..3153.53 rows=26088 width=4)
> Filter: (NOT (hashed SubPlan 1))
> SubPlan 1
> -> HashAggregate (cost=1903.15..2062.48 rows=15933 width=4)
> -> Seq Scan on service_pack_fileset_maps (cost=0.00..1663.52 rows=95852 width=4)
> (5 rows)
The difference is the "distinct" keyword in the inner select.
What I'm confused about is why isn't the "distinct" implicit? I thought the construct "blah in ( select ... )" was using "sets" and an item (I thought) can not be in a set more than once.
Perhaps my question is the opposite really? Why would you not always use "distinct" in the inner select when the operator is "in" or "not in" ?
And if I can throw in another question on top: is there a different method other than "not in" that would work better?
Thank you guys for the help and a really awesome database.
pedz
(this is *suppose* to be sent as plain text... I hope my mailer does what it is told)
From | Date | Subject | |
---|---|---|---|
Next Message | Steven Schlansker | 2012-07-06 23:25:05 | Re: Suboptimal query plan fixed by replacing OR with UNION |
Previous Message | David Johnston | 2012-07-06 20:50:21 | Re: Weird "LIKE" behaviour |