Quick Links

Re: [SQL] Replacing a simple nested query?

From:	Steve Wampler <swampler(at)noao(dot)edu>
To:	pgsql-sql(at)postgresql(dot)org, Postgres-performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: [SQL] Replacing a simple nested query?
Date:	2003-07-14 13:38:18
Message-ID:	1058189898.23013.135.camel@weaver.tuc.noao.edu
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance pgsql-sql

On Sun, 2003-07-13 at 14:50, Steve Wampler wrote:
> I've got a simple nested query:
>
> select * from attributes where id in (select id from
> attributes where (name='obsid') and (value='oid00066'));
>
> that performs abysmally. I've heard this described as the
> 'classic WHERE IN' problem.
>
> Is there a better way to obtain the same results? The inner
> select identifies a set of ids (2049 of them, to be exact)
> that are then used to locate records that have the same id
> (about 30-40K of those, including the aforementioned 2049).

For the record, Joe Conway and Hannu Krosing both provided
the same solution:

select at.* from attributes_table at, attributes a
where at.id = a.id and a.name='obsid' and a.value='oid00066';

which is several orders of infinity faster than than my naive
approach above:
-------------------------------------------------------------
lab.devel.configdb=# explain analyze select * from
attributes_table where id in (select id from attributes
where (name='obsid') and (value='oid00066')) order by id;
NOTICE: QUERY PLAN:

Index Scan using id_index on attributes_table (cost=0.00..8773703316.10
rows=241201 width=59) (actual time=136297.91..3418016.04 rows=32799
loops=1)
SubPlan
-> Materialize (cost=18187.48..18187.48 rows=15 width=25) (actual
time=0.01..1.68 rows=1979 loops=482402)
-> Index Scan using name_index on attributes_table
(cost=0.00..18187.48 rows=15 width=25) (actual time=0.27..251.95
rows=2049 loops=1)
Total runtime: 3418035.38 msec
--------------------------------------------------------------
lab.devel.configdb=# explain analyze select at.* from
attributes_table at, attributes a
where at.id = a.id and a.name='obsid' and a.value='oid00066';
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..18739.44 rows=217 width=84) (actual
time=0.76..1220.65 rows=32799 loops=1)
-> Index Scan using name_index on attributes_table
(cost=0.00..18187.48 rows=15 width=25) (actual time=0.47..507.31
rows=2049 loops=1)
-> Index Scan using id_index on attributes_table at
(cost=0.00..35.80 rows=12 width=59) (actual time=0.11..0.31 rows=16
loops=2049)
Total runtime: 1235.42 msec
-------------------------------------------------------------------

My thanks to both Joe and Hannu!
Steve
--
Steve Wampler -- swampler(at)noao(dot)edu
Quantum materiae materietur marmota monax si marmota
monax materiam possit materiari?

In response to

Replacing a simple nested query? at 2003-07-13 21:50:42 from Steve Wampler

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Paul Thomas	2003-07-14 14:06:39	Re: Pgsql - Red Hat Linux - VS MySQL VS MSSQL
Previous Message	Andrew Sullivan	2003-07-14 11:45:31	Re: Pgsql - Red Hat Linux - VS MySQL VS MSSQL

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Scott Cain	2003-07-14 17:01:36	Cannot insert dup id in pk
Previous Message	markus brosch	2003-07-14 12:57:19	Re: max length of sql select statement (long!)