From: | Tim Dudgeon <tdudgeon(dot)ml(at)gmail(dot)com> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | slow sub-query problem |
Date: | 2014-11-17 18:34:34 |
Message-ID: | 546A3FBA.9020901@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I'm having problems optimising a query that's very slow due to a sub-query.
The query is this:
SELECT
structure_id, batch_id, property_id, property_data
FROM chemcentral.structure_props
WHERE structure_id IN (SELECT structure_id FROM
chemcentral.structure_props WHERE property_id = 643413)
AND property_id IN (1, 643413, 1106201);
and it takes 18s to execute.
It I replace the sub-query with the inlined 369 values so that the 4th
line looks like this:
WHERE structure_id IN (1122687,309004,306064 ...)
it takes a few ms.
The plans are:
1. sub-query
"Nested Loop (cost=1132.97..1182.28 rows=43 width=644) (actual
time=70.926..18937.669 rows=381 loops=1)"
" -> HashAggregate (cost=1091.73..1091.75 rows=2 width=4) (actual
time=2.829..3.212 rows=366 loops=1)"
" Group Key: structure_props_1.structure_id"
" -> Index Scan using idx_sp_property_id on structure_props
structure_props_1 (cost=0.43..1090.77 rows=382 width=4) (actual
time=0.033..2.380 rows=369 loops=1)"
" Index Cond: (property_id = 643413)"
" -> Bitmap Heap Scan on structure_props (cost=41.24..45.26 rows=1
width=644) (actual time=51.726..51.727 rows=1 loops=366)"
" Recheck Cond: ((structure_id = structure_props_1.structure_id)
AND (property_id = ANY ('{1,643413,1106201}'::integer[])))"
" Heap Blocks: exact=381"
" -> BitmapAnd (cost=41.24..41.24 rows=1 width=0) (actual
time=51.714..51.714 rows=0 loops=366)"
" -> Bitmap Index Scan on idx_sp_structure_id
(cost=0.00..6.80 rows=317 width=0) (actual time=0.046..0.046 rows=475
loops=366)"
" Index Cond: (structure_id =
structure_props_1.structure_id)"
" -> Bitmap Index Scan on idx_sp_property_id
(cost=0.00..33.90 rows=1146 width=0) (actual time=51.656..51.656
rows=811892 loops=366)"
" Index Cond: (property_id = ANY
('{1,643413,1106201}'::integer[]))"
"Planning time: 0.497 ms"
"Execution time: 18937.868 ms"
2. inlined values
"Bitmap Heap Scan on structure_props (cost=2600.48..2645.29 rows=10
width=644) (actual time=71.676..72.724 rows=381 loops=1)"
" Recheck Cond: ((property_id = ANY ('{1,643413,1106201}'::integer[]))
AND (structure_id = ANY
('{1122687,309004,306064,278852,234066,1122645,412925,280033,423990,568929,448302,278487,278955,40430,40430,467979,467508,288413,289746,306073,355352,265583,4779
(...)"
" Heap Blocks: exact=381"
" -> BitmapAnd (cost=2600.48..2600.48 rows=10 width=0) (actual
time=71.608..71.608 rows=0 loops=1)"
" -> Bitmap Index Scan on idx_sp_property_id (cost=0.00..33.90
rows=1146 width=0) (actual time=54.614..54.614 rows=811892 loops=1)"
" Index Cond: (property_id = ANY
('{1,643413,1106201}'::integer[]))"
" -> Bitmap Index Scan on idx_sp_structure_id
(cost=0.00..2566.32 rows=117367 width=0) (actual time=14.487..14.487
rows=173867 loops=1)"
" Index Cond: (structure_id = ANY
('{1122687,309004,306064,278852,234066,1122645,412925,280033,423990,568929,448302,278487,278955,40430,40430,467979,467508,288413,289746,306073,355352,265583,477941,326652,326602,233964,15338,397586,1122647,3088
(...)"
"Planning time: 1.052 ms"
"Execution time: 72.858 ms"
Table is like this:
CREATE TABLE chemcentral.structure_props
(
id serial NOT NULL,
source_id integer NOT NULL,
structure_id integer NOT NULL,
batch_id character varying(16),
parent_id integer,
property_id integer NOT NULL,
property_data jsonb,
CONSTRAINT structure_props_pkey PRIMARY KEY (id)
)
All relevant columns are indexed and using PostgreSQL 9.4.
Any clues how to re-write it to avoid the slow sub-query.
Many thanks
Tim
From | Date | Subject | |
---|---|---|---|
Next Message | David G Johnston | 2014-11-17 18:44:46 | Re: slow sub-query problem |
Previous Message | Seb | 2014-11-14 17:04:52 | Re: matching against start/end times and diagnostic values |