Re: force the use of a particular index

From: Scott Cain <cain(at)cshl(dot)org>
To: Rod Taylor <rbt(at)rbt(dot)ca>
Cc: Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: force the use of a particular index
Date: 2003-07-11 15:36:19
Message-ID: 1057937779.5767.17.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Rod,

I see what you mean about the scan on the feature_pkey taking a long
time. I tried several things to remedy that. I created an index on
feature (feature_id,type_id) (which I don't think makes sense since
feature_id is the primary key, so add another column really doesn't
help). I also created a index on feature (type_id, feature_id), but the
planner doesn't use it. Also, there was an already existing index on
feature (type_id) that the planner never used.

One thing I tried that changed the query plan and improved performance
slightly (but still nowhere near what I need) was to add a partial index
on featureloc on (fmin,fmax) where scrfeature_id=6. This is something I
could realistically do since there are relatively few (>30)
srcfeature_ids that I am interested in, so putting in place a partial
index for each of them would not be a big deal. Nevertheless, the
performance is still not there. Here is the EXPLAIN ANALYZE for this
situation:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=156172.23..156200.11 rows=159 width=54) (actual time=63631.93..63631.93 rows=1 loops=1)
-> Sort (cost=156172.23..156176.21 rows=1594 width=54) (actual time=63631.93..63631.93 rows=1 loops=1)
Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id
-> Hash Join (cost=135100.30..156087.46 rows=1594 width=54) (actual time=63631.29..63631.79 rows=1 loops=1)
Hash Cond: ("outer".feature_id = "inner".feature_id)
-> Index Scan using featureloc_src_6 on featureloc fl (cost=0.00..18064.99 rows=101883 width=14) (actual time=26.11..430.00 rows=570 loops=1)
Index Cond: ((fmin <= 2585581) AND (fmax >= 2565581))
Filter: (srcfeature_id = 6)
-> Hash (cost=134601.43..134601.43 rows=48347 width=40) (actual time=63182.86..63182.86 rows=0 loops=1)
-> Index Scan using feature_pkey on feature f (cost=0.00..134601.43 rows=48347 width=40) (actual time=69.98..62978.27 rows=13825 loops=1)
Filter: (type_id = 219)
Total runtime: 63632.28 msec
(12 rows)

Any other ideas?

Thanks,
Scott

On Fri, 2003-07-11 at 09:38, Rod Taylor wrote:
> On Fri, 2003-07-11 at 13:17, Scott Cain wrote:
> > The problem (at least as it appears to me) is not that it is performing
> > a table scan instead of an index scan, it is that it is using the wrong
> > index. Here is the output from EXPLAIN ANALYZE:
> >
> > QUERY PLAN
> > ------------------------------------------------------------------------------------------------------------------------------------------------------------
> > Unique (cost=494008.47..494037.59 rows=166 width=54) (actual time=114660.37..114660.38 rows=1 loops=1)
> > -> Sort (cost=494008.47..494012.63 rows=1664 width=54) (actual time=114660.37..114660.37 rows=1 loops=1)
> > Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id
> > -> Nested Loop (cost=0.00..493919.44 rows=1664 width=54) (actual time=2596.13..114632.90 rows=1 loops=1)
> > -> Index Scan using feature_pkey on feature f (cost=0.00..134601.43 rows=52231 width=40) (actual time=105.74..56048.87 rows=13825 loops=1)
> > Filter: (type_id = 219)
> > -> Index Scan using featureloc_idx1 on featureloc fl (cost=0.00..6.87 rows=1 width=14) (actual time=4.23..4.23 rows=0 loops=13825)
> > Index Cond: ("outer".feature_id = fl.feature_id)
> > Filter: ((srcfeature_id = 6) AND (fmin <= 2585581) AND (fmax >= 2565581))
> > Total runtime: 114660.91 msec
>
> > it is using on featureloc (featureloc_idx1) is on the foreign key
> > feature_id. It should instead be using another index, featureloc_idx3,
> > which is built on (srcfeature_id, fmin, fmax).
>
> Nope.. The optimizer is right in the decision to use featureloc_idx1.
> You will notice it is expecting to retrieve a single row from this
> index, but the featureloc_idx3 is bound to be larger (due to indexing
> more data), thus take more disk reads for the exact same information (or
> in this case, lack thereof).
>
> What is taking a long time is the scan on feature_pkey. It looks like it
> is throwing away a ton of rows that are not type_id = 219. Either that,
> or you do a pile of deletes and haven't run REINDEX recently.
>
> Create an index consisting of (feature_id, type_id). This will probably
> make a significant different in execution time.
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain(at)cshl(dot)org
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Cain 2003-07-11 15:38:47 Re: force the use of a particular index
Previous Message Tom Lane 2003-07-11 15:24:01 Re: force the use of a particular index