From: | Stuart Rison <stuart(at)ludwig(dot)ucl(dot)ac(dot)uk> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | problems with a sub-select (takes donkeys' years) |
Date: | 1999-03-29 16:32:57 |
Message-ID: | v04003a05b325562e3c99@[128.40.242.190] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello,
I've got two SELECT statements which work fine (take about 5secs each) when
I do them sequentially but take ages (around 5mins) if in a statement with
a sub-SELECT.
The individual SELECTs are:
SELECT p1.brecard_id FROM
malignant_pathologies p1,
malignant_pathologies p2,
malignant_pathologies p3
WHERE p1.code='MAPH'
AND p1.brecard_id=p2.brecard_id
AND p2.code='AMCA'
AND p3.brecard_id=p1.brecard_id
AND p3.code='LOCA';
brecard_id
----------------
DSTL12031999016
DPHA12031999017
DCCH12031999056
DUCH12031999059
DUCH12031999063
DCCH12031999077
DUCH12031999098
(7 rows)
SELECT brecard_id,count(brecard_id) FROM
malignant_pathologies
WHERE brecard_id IN
('DSTL12031999016','DPHA12031999017','DCCH12031999056','DUCH12031999059','DUCH12
031999063','DCCH12031999077','DUCH12031999098')
GROUP BY brecard_id;
brecard_id |count
----------------+-----
DCCH12031999056 | 10
DCCH12031999077 | 12
DPHA12031999017 | 11
DSTL12031999016 | 11
DUCH12031999059 | 13
DUCH12031999063 | 6
DUCH12031999098 | 14
(7 rows)
(This is just an example query, what I'm trying to do here is use the list
of brecard_id's generated by the previous query).
Both of these execute in about 5 seconds (on 1000 rows)
When I put them together as:
SELECT brecard_id,count(brecard_id) FROM
malignant_pathologies
WHERE brecard_id IN (
SELECT p1.brecard_id FROM
malignant_pathologies p1,
malignant_pathologies p2,
malignant_pathologies p3
WHERE p1.code='MAPH'
AND p1.brecard_id=p2.brecard_id
AND p2.code='AMCA'
AND p3.brecard_id=p1.brecard_id
AND p3.code='LOCA');
It takes around 5mins for the query to complete! The EXPLAIN for the quey is:
NOTICE: QUERY PLAN:
Aggregate (cost=4.27 size=0 width=0)
-> Group (cost=4.27 size=0 width=0)
-> Sort (cost=4.27 size=0 width=0)
-> Seq Scan on malignant_pathologies (cost=4.27 size=99
width=12)
SubPlan
-> Nested Loop (cost=8.27 size=1 width=36)
-> Nested Loop (cost=6.27 size=1 width=24)
-> Seq Scan on malignant_pathologies p3
(cost=4.27 size=1 width=12)
-> Index Scan using
malignant_pathologies_pkey on malignant_pathologies p1 (cost=2.00 size=1
width=12)
-> Index Scan using malignant_pathologies_pkey
on malignant_pathologies p2 (cost=2.00 size=1 width=12)
EXPLAIN
[Which is not different from putting together the EXPLAINs from each of the
individual queries]
Can anyone explain why the sub-query form takes so long?
Regards,
Stuart.
+-------------------------+--------------------------------------+
| Stuart Rison | Ludwig Institute for Cancer Research |
+-------------------------+ 91 Riding House Street |
| Tel. (0171) 878 4041 | London, W1P 8BT, UNITED KINGDOM. |
| Fax. (0171) 878 4040 | stuart(at)ludwig(dot)ucl(dot)ac(dot)uk |
+-------------------------+--------------------------------------+
From | Date | Subject | |
---|---|---|---|
Next Message | Gerard Saraber | 1999-03-29 16:44:09 | backing up pgsql |
Previous Message | Herouth Maoz | 1999-03-29 16:00:00 | Re: [GENERAL] Returning an integer from a date |