Re: QUERY PLANNER - Indexe mono column VS composite Index

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Nicolas Paris <niparisco(at)gmail(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: QUERY PLANNER - Indexe mono column VS composite Index
Date: 2015-07-10 16:20:21
Message-ID: CAMkU=1wa_rMjE592+FotD0VLGXP-2FCrd_iEaiHr-s2BRW-v+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Jul 10, 2015 at 2:34 AM, Nicolas Paris <niparisco(at)gmail(dot)com> wrote:

>
>
> =========3: without a constraint on tval_char => seq
> scan========================================================================
>
>
> EXPLAIN ANALYSE select f.patient_num
> from i2b2data_multi_nomi.observation_fact f
> where
> f.concept_cd IN (select concept_cd from
> i2b2data_multi_nomi.concept_dimension where concept_path LIKE
> '\\i2b2\\cim10\\A00-B99\\%')
> AND ( modifier_cd = '@' AND valtype_cd = 'T' )
> group by f.patient_num ;
>
>
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate (cost=1305637.84..1305688.23 rows=5039 width=4) (actual
> time=22689.279..22694.583 rows=16865 loops=1)
> Group Key: f.patient_num
> -> Hash Join (cost=4760.13..1297561.67 rows=3230468 width=4) (actual
> time=12368.418..22674.145 rows=33835 loops=1)
> Hash Cond: ((f.concept_cd)::text =
> (concept_dimension.concept_cd)::text)
> -> Seq Scan on observation_fact f (cost=0.00..1280362.92
> rows=3230468 width=14) (actual time=0.226..22004.808 rows=3195625 loops=1)
> Filter: (((modifier_cd)::text = '@'::text) AND
> ((valtype_cd)::text = 'T'::text))
> Rows Removed by Filter: 41423695
> -> Hash (cost=4748.64..4748.64 rows=919 width=10) (actual
> time=46.833..46.833 rows=925 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 39kB
> -> HashAggregate (cost=4739.45..4748.64 rows=919
> width=10) (actual time=46.196..46.515 rows=925 loops=1)
> Group Key: (concept_dimension.concept_cd)::text
> -> Seq Scan on concept_dimension
> (cost=0.00..4734.73 rows=1892 width=10) (actual time=18.899..45.800
> rows=925 loops=1)
> Filter: ((concept_path)::text ~~
> '\\i2b2\\cim10\\A00-B99\\%'::text)
> Rows Removed by Filter: 186413
> Planning time: 1.940 ms
> Execution time: 22695.913 ms
>
> What I would like is the planner allways hit of_idx_modifier
>

What does the above explain analyze query give when you have an index on
just modifier_cd, or maybe on both (modifier_cd, valtype_cd)?

Your original email said it uses the index in that case, but we would need
to see the numbers in the query plan in order to figure out why it is doing
that.

It seems like that the "tval_char IN ('DP')" part of the restriction is
very selective, while the other two restrictions are not.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message James Cloos 2015-07-11 23:18:21 Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Previous Message Nicolas Paris 2015-07-10 09:34:13 Re: QUERY PLANNER - Indexe mono column VS composite Index