From: | Rajarshi Guha <rguha(at)indiana(dot)edu> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | suggestions on improving a query |
Date: | 2007-02-12 20:23:40 |
Message-ID: | 1171311821.7875.31.camel@localhost |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi, I have a query that involves 3 tables. T
select pubchem_compound.openeye_can_smiles, pubchem_compound.nist_inchi, dock.cid, dockscore_plp.*
from dock, dockscore_plp, pubchem_compound
where
dock.target = '1YC1' and
dock.dockid = dockscore_plp.id and
dock.cid = pubchem_compound.cid
order by dockscore_plp.total
limit 10;
The output of explain analyze is
Limit (cost=0.00..387.36 rows=10 width=297) (actual time=242977.644..462748.215 rows=10 loops=1)
-> Nested Loop (cost=0.00..37325186.12 rows=963575 width=297) (actual time=242977.638..462748.175 rows=10 loops=1)
-> Nested Loop (cost=0.00..31523550.51 rows=963575 width=90) (actual time=242900.629..461810.902 rows=10 loops=1)
-> Index Scan using plp_total_idx on dockscore_plp (cost=0.00..16733229.92 rows=4669988 width=80) (actual time=98.323..322537.605 rows=25197 loops=1)
-> Index Scan using dock_pkey on dock (cost=0.00..3.15 rows=1 width=18) (actual time=5.521..5.521 rows=0 loops=25197)
Index Cond: (dock.dockid = "outer".id)
Filter: (target = '1YC1'::text)
-> Index Scan using pubchem_compound_pkey on pubchem_compound (cost=0.00..6.01 rows=1 width=216) (actual time=93.699..93.704 rows=1 loops=10)
Index Cond: (("outer".cid)::text = (pubchem_compound.cid)::text)
Total runtime: 462748.439 ms
(10 rows)
Now, the tables 'dock' and 'dockscore_plp' have 4.6M rows and
'pubchem_compound' has 10M rows.
However the clause:
dock.target = '1YC1' and
dock.dockid = dockscore_plp.id
reduces the number of rows from 4.6M to 96K. I had figured that after
this the query would be very fast. But the explain analyze seems to
indicate that the dockscore_plp table is being sorted (using the index
plp_total_idx) in its entirety. This would then be the bottleneck
Is this a correct interpretation?
What I expected was that the sort would be applied to the 96K subset of
dockscore_plp, rather than the whole table.
Is it possible to restructure the query such that the sort is done on
96K rows rather than 4.6M rows?
Thanks,
-------------------------------------------------------------------
Rajarshi Guha <rguha(at)indiana(dot)edu>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
"I'd love to go out with you, but my favorite commercial is on TV."
From | Date | Subject | |
---|---|---|---|
Next Message | Reid Thompson | 2007-02-12 20:41:19 | Re: Testing embedded SQL in C |
Previous Message | HT NB | 2007-02-12 20:16:54 | Testing embedded SQL in C |