From: | Nicolas Paris <niparisco(at)gmail(dot)com> |
---|---|
To: | Guillaume Lelarge <guillaume(at)lelarge(dot)info> |
Cc: | pgsql-performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: QUERY PLANNER - Indexe mono column VS composite Index |
Date: | 2015-07-10 09:34:13 |
Message-ID: | CA+ssMORBzDy65z7+0QT8Yq1yr_LuLMd0w4E8SgLZjALdoAucAA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Ok, here is the problem (it's different than what I explained before)
==INDEX ==
CREATE INDEX of_idx_modifier
ON i2b2data_multi_nomi.observation_fact
USING btree
(concept_cd COLLATE pg_catalog."default", modifier_cd COLLATE
pg_catalog."default", valtype_cd COLLATE pg_catalog."default", tval_char
COLLATE pg_catalog."default", nval_num);
==QUERY==
EXPLAIN ANALYSE select f.patient_num
from i2b2data_multi_nomi.observation_fact f
where
f.concept_cd IN (select concept_cd from
i2b2data_multi_nomi.concept_dimension where concept_path LIKE
'\\i2b2\\cim10\\A00-B99\\%')
AND ( modifier_cd = '@' AND valtype_cd = 'T' AND tval_char IN
('DP') )
group by f.patient_num ;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=35153.99..35154.40 rows=41 width=4) (actual
time=81.223..82.718 rows=5206 loops=1)
Group Key: f.patient_num
-> Nested Loop (cost=4740.02..35089.11 rows=25951 width=4) (actual
time=45.393..76.893 rows=7359 loops=1)
-> HashAggregate (cost=4739.45..4748.64 rows=919 width=10)
(actual time=45.097..45.586 rows=925 loops=1)
Group Key: (concept_dimension.concept_cd)::text
-> Seq Scan on concept_dimension (cost=0.00..4734.73
rows=1892 width=10) (actual time=17.479..44.573 rows=925 loops=1)
Filter: ((concept_path)::text ~~
'\\i2b2\\cim10\\A00-B99\\%'::text)
Rows Removed by Filter: 186413
-> Index Scan using of_idx_modifier on observation_fact f
(cost=0.56..32.86 rows=15 width=14) (actual time=0.025..0.031 rows=8
loops=925)
Index Cond: (((concept_cd)::text =
(concept_dimension.concept_cd)::text) AND ((modifier_cd)::text = '@'::text)
AND ((valtype_cd)::text
= 'T'::text) AND ((tval_char)::text = 'DP'::text))
Planning time: 2.843 ms
Execution time: 83.273 ms
(12 rows)
============2 : without 3 constraint that match index => seq
scan=======================================================================
EXPLAIN ANALYSE select f.patient_num
from i2b2data_multi_nomi.observation_fact f
where
f.concept_cd IN (select concept_cd from
i2b2data_multi_nomi.concept_dimension where concept_path LIKE
'\\i2b2\\cim10\\A00-B99\\%')
-- AND ( modifier_cd = '@' AND valtype_cd = 'T' AND tval_char IN
('DP') )
group by f.patient_num ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1345377.85..1346073.80 rows=69595 width=4) (actual
time=18043.140..18048.741 rows=16865 loops=1)
Group Key: f.patient_num
-> Hash Join (cost=4760.13..1233828.53 rows=44619728 width=4) (actual
time=17109.041..18027.763 rows=33835 loops=1)
Hash Cond: ((f.concept_cd)::text =
(concept_dimension.concept_cd)::text)
-> Seq Scan on observation_fact f (cost=0.00..1057264.28
rows=44619728 width=14) (actual time=0.040..7918.984 rows=44619320 loops=1)
-> Hash (cost=4748.64..4748.64 rows=919 width=10) (actual
time=49.523..49.523 rows=925 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 39kB
-> HashAggregate (cost=4739.45..4748.64 rows=919 width=10)
(actual time=48.806..49.117 rows=925 loops=1)
Group Key: (concept_dimension.concept_cd)::text
-> Seq Scan on concept_dimension (cost=0.00..4734.73
rows=1892 width=10) (actual time=18.828..48.191 rows=925 loops=1)
Filter: ((concept_path)::text ~~
'\\i2b2\\cim10\\A00-B99\\%'::text)
Rows Removed by Filter: 186413
Planning time: 2.588 ms
Execution time: 18051.031 ms
(14 rows)
=========3: without a constraint on tval_char => seq
scan========================================================================
EXPLAIN ANALYSE select f.patient_num
from i2b2data_multi_nomi.observation_fact f
where
f.concept_cd IN (select concept_cd from
i2b2data_multi_nomi.concept_dimension where concept_path LIKE
'\\i2b2\\cim10\\A00-B99\\%')
AND ( modifier_cd = '@' AND valtype_cd = 'T' )
group by f.patient_num ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1305637.84..1305688.23 rows=5039 width=4) (actual
time=22689.279..22694.583 rows=16865 loops=1)
Group Key: f.patient_num
-> Hash Join (cost=4760.13..1297561.67 rows=3230468 width=4) (actual
time=12368.418..22674.145 rows=33835 loops=1)
Hash Cond: ((f.concept_cd)::text =
(concept_dimension.concept_cd)::text)
-> Seq Scan on observation_fact f (cost=0.00..1280362.92
rows=3230468 width=14) (actual time=0.226..22004.808 rows=3195625 loops=1)
Filter: (((modifier_cd)::text = '@'::text) AND
((valtype_cd)::text = 'T'::text))
Rows Removed by Filter: 41423695
-> Hash (cost=4748.64..4748.64 rows=919 width=10) (actual
time=46.833..46.833 rows=925 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 39kB
-> HashAggregate (cost=4739.45..4748.64 rows=919 width=10)
(actual time=46.196..46.515 rows=925 loops=1)
Group Key: (concept_dimension.concept_cd)::text
-> Seq Scan on concept_dimension (cost=0.00..4734.73
rows=1892 width=10) (actual time=18.899..45.800 rows=925 loops=1)
Filter: ((concept_path)::text ~~
'\\i2b2\\cim10\\A00-B99\\%'::text)
Rows Removed by Filter: 186413
Planning time: 1.940 ms
Execution time: 22695.913 ms
What I would like is the planner allways hit of_idx_modifier
Thanks !
2015-07-09 22:49 GMT+02:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info>:
> 2015-07-09 22:34 GMT+02:00 Nicolas Paris <niparisco(at)gmail(dot)com>:
>
>> Hello,
>>
>> My 9.4 database is used as datawharehouse. I can't change the queries
>> generated.
>>
>> first index : INDEX COL (A,B,C,D,E)
>>
>>
>> In case of query based on COL A, the query planner sometimes go to a seq
>> scan instead of using the first composite index.
>>
>> The solution is to add a second indexe (redondant)
>> second index : INDEX COL (A)
>>
>> In case of query based on COL A, B, C, D, (without E) as well, it doesn't
>> uses the first index and prefers a seq scan.
>>
>> I could create a third indexe :
>> first index : INDEX COL (A,B,C,D)
>>
>> But I hope there is an other solution for that (table is huge).
>>
>> It seems that the malus for using composite indexes is high.
>>
>> Question is : is there a way to make the composite index more attractive
>> to query planner ? (idealy equivalent to mono column indexe)
>>
>>
> There's no way we can answer that without seeing actual queries and query
> plans.
>
>
> --
> Guillaume.
> http://blog.guillaume.lelarge.info
> http://www.dalibo.com
>
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2015-07-10 16:20:21 | Re: QUERY PLANNER - Indexe mono column VS composite Index |
Previous Message | Guillaume Lelarge | 2015-07-09 20:49:04 | Re: QUERY PLANNER - Indexe mono column VS composite Index |