From: | Mark Williams <mark(dot)williams(at)jivesoftware(dot)com> |
---|---|
To: | |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Bad Query Plan with Range Query |
Date: | 2011-04-15 18:06:51 |
Message-ID: | 4DA8893B.8050403@jivesoftware.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Thanks for the response guys. There is something else which confuses me.
If I re-write the query like this:
explain SELECT messageID FROM Message WHERE modificationDate >=
1302627793988 ORDER BY modificationDate LIMIT 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Limit (cost=0.00..2.97 rows=1 width=16)
-> Index Scan using jvmssg_mdate_idx on message
(cost=0.00..3705.59 rows=1249 width=16)
Index Cond: (modificationdate >= 1302627793988::bigint)
(3 rows)
I also get a better plan. However, this is not always the case. On some
other instances we still get a sequential scan on the primary key.
On 04/15/2011 10:54 AM, Kevin Grittner wrote:
> Mark Williams<mark(dot)williams(at)jivesoftware(dot)com> wrote:
>
>> explain SELECT min(messageID) FROM Message
>> WHERE modificationDate>= 1302627793988;
>
>> For some reason it is deciding to scan the primary key column of
>> the table. This results in scanning the entire table
>
> No, it scans until it finds the first row where modificationDate>=
> 1302627793988, at which point the scan is done because it's doing an
> ascending scan on what you want the min() of. You might have a clue
> that the first such row will be ten million rows into the scan, but
> the optimizer doesn't know that. It's assuming that rows which meet
> that condition are scattered randomly through the primary key range.
> It thinks that it will, on average, need to scan 1249 rows to find a
> match.
>
> The patch Ken referenced causes the alternative to be assigned a
> more accurate (and lower) cost, which tips the scales in favor of
> that plan -- at least for the case you've tried; but this seems to
> me to be another case related to the correlation of values. It's a
> new and different form of it, but it seems at least somewhat
> related. It might be a good example for those working on
> multi-column statistics to keep in mind.
>
> -Kevin
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2011-04-15 18:13:26 | Re: Bad Query Plan with Range Query |
Previous Message | Kevin Grittner | 2011-04-15 17:54:18 | Re: Bad Query Plan with Range Query |