From: | "John Surcombe" <John(dot)Surcombe(at)digimap(dot)gg> |
---|---|
To: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Planner wrongly shuns multi-column index for select .. order by col1, col2 limit 1 |
Date: | 2011-03-12 10:07:41 |
Message-ID: | C0B87DEF8EF4F94C9D8FBD3151B627B26C0C23@digimapserver.digimap.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello,
We are using PostgreSQL 9.0.3, compiled by Visual C++ build 1500,
32-bit, installed on Windows 2003 R2 32-bit.
We have an 'aisposition' table used for a GPS tracking application,
containing ~30 million rows and a number of indexes. Two of these are:
idx_receiveddatetime: indexes aisposition(receiveddatetime timestamp)
idx_userid_receiveddatetime: indexes aisposition(userid int4 desc,
receiveddatetime timestamp desc)
The problem we get is that the following query is taking many minutes to
run:
select * from aisposition where userid = 311369000 order by userid desc,
receiveddatetime desc limit 1
When we 'EXPLAIN' this query, PostgreSQL says it is using the index
idx_receiveddatetime. The way the application is designed means that in
virtually all cases the query will have to scan a very long way into
idx_receiveddatetime to find the first record where userid = 311369000.
If however we delete the idx_receiveddatetime index, the query uses the
idx_userid_receiveddatetime index, and the query only takes a few
milliseconds.
The EXPLAIN ANALYZE output with idx_receiveddatetime in place is:
Limit (cost=0.00..1.30 rows=1 width=398) (actual
time=1128097.540..1128097.541 rows=1 loops=1)
-> Index Scan Backward using idx_receiveddatetime on aisposition
(cost=0.00..2433441.05 rows=1875926 width=398) (actual
time=1128097.532..1128097.532 rows=1 loops=1)
Filter: (userid = 311369000)
Total runtime: 1128097.609 ms
And with that index deleted:
Limit (cost=0.00..4.01 rows=1 width=398) (actual time=60.633..60.634
rows=1 loops=1)
-> Index Scan using idx_userid_receiveddatetime on aisposition
(cost=0.00..7517963.47 rows=1875926 width=398) (actual
time=60.629..60.629 rows=1 loops=1)
Index Cond: (userid = 311369000)
Total runtime: 60.736 ms
We would obviously prefer PostgreSQL to use the
idx_userid_receiveddatetime index in all cases, because we know that
this will guarantee results in a timely manner, whereas using
idx_receiveddatetime will usually require a scan of much of the table
and our application will not work. What are we doing wrong?
Cheers now,
John
From | Date | Subject | |
---|---|---|---|
Next Message | Marti Raudsepp | 2011-03-12 17:07:29 | Re: Tuning massive UPDATES and GROUP BY's? |
Previous Message | Kevin Grittner | 2011-03-11 22:32:03 | Re: ANTI-JOIN needs table, index scan not possible? |