From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "John Surcombe" <John(dot)Surcombe(at)digimap(dot)gg> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1 |
Date: | 2011-03-13 22:10:29 |
Message-ID: | 5238.1300054229@sss.pgh.pa.us |
Lists: | pgsql-performance |
I wrote:
> "John Surcombe" <John(dot)Surcombe(at)digimap(dot)gg> writes:
>> When we 'EXPLAIN' this query, PostgreSQL says it is using the index
>> idx_receiveddatetime. The way the application is designed means that in
>> virtually all cases the query will have to scan a very long way into
>> idx_receiveddatetime to find the first record where userid = 311369000.
>> If however we delete the idx_receiveddatetime index, the query uses the
>> idx_userid_receiveddatetime index, and the query only takes a few
>> milliseconds.
> That's just bizarre ... it knows the index is applicable, and the cost
> estimates clearly favor the better index, so why did it pick the worse
> one?
No, scratch that, I misread the plans. It *is* picking the plan it
thinks has lower cost; it's just a mistaken cost estimate. It's strange
though that the less selective indexscan is getting a lower cost
estimate. I wonder whether your table is (almost) perfectly ordered by
receiveddatetime, such that the one-column index has correlation close
to 1.0. That could possibly lower the cost estimate to the point where
it'd appear to dominate the other index. It'd be useful to see the
pg_stats.correlation value for both the userid and receiveddatetime
columns.
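
A minimal sketch of a query that would show those values. The table name
is not given in this message, so 'your_table' below is a hypothetical
placeholder; the column names are the ones mentioned above.

```sql
-- 'your_table' is a placeholder (assumption): substitute the real table name.
-- pg_stats is the standard statistics view; correlation ranges from -1 to +1,
-- and values near +1 or -1 mean the physical row order closely follows the
-- column's sort order.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'your_table'
  AND attname IN ('userid', 'receiveddatetime');
```

The values reflect the most recent ANALYZE of the table, so it may be
worth running ANALYZE first if the statistics could be stale.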
regards, tom lane