From: | Thom Brown <thombrown(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Wildly inaccurate query plan |
Date: | 2010-05-28 19:05:40 |
Message-ID: | AANLkTimD7onzq_mLk1bYdGoYW3yaS_iqNH53UqF_Uzyx@mail.gmail.com |
Lists: | pgsql-performance |
On 28 May 2010 19:54, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thom Brown <thombrown(at)gmail(dot)com> writes:
>> I get this:
>
>> Limit (cost=0.00..316895.11 rows=400 width=211) (actual
>> time=3.880..1368.936 rows=400 loops=1)
>> -> GroupAggregate (cost=0.00..41843621.95 rows=52817 width=211)
>> (actual time=3.872..1367.048 rows=400 loops=1)
>> -> Index Scan using "binaryID_2576_idx" on parts_2576
>> (cost=0.00..41683754.21 rows=10578624 width=211) (actual
>> time=0.284..130.756 rows=19954 loops=1)
>> Index Cond: (("binaryID")::text >
>> '1082fa89fe499741b8271f9c92136f44'::text)
>> Total runtime: 1370.140 ms
>
>> The first thing which strikes me is how the GroupAggregate step shows
>> it got the 400 rows which matches the limit, but it estimated 52,817
>> rows. Shouldn't it have already known it would be 400?
>
> No. Rowcount estimates are always in terms of what the node would emit
> if allowed to run to completion. Likewise cost. In this case both the
> indexscan and the groupagg are terminated early once they satisfy the
> limit. The planner is expecting this which is why the estimated cost
> for the limit node is way less than those for its inputs.
>
> That looks like a perfectly reasonable plan from here, though it would
> probably not get chosen with a larger limit or no limit at all, since
> the ultimate costs are pretty large. Essentially this is a fast-start
> plan rather than a lowest-total-cost plan, and that looks like the
> best bet for a small limit value.
>
> regards, tom lane
You're absolutely right: it's not chosen without the limit. I see
what you mean about terminating early once it has enough rows. It's
a shame I can't optimise it further, though, as the real query runs
with a limit of 4000, which takes a long time to complete.
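[For anyone reading this thread later: the fast-start versus lowest-total-cost trade-off Tom describes can be seen by comparing plans at different limits. The sketch below is hypothetical — the table and column names are taken from the plan quoted above, but the exact query shape is an assumption, not the original query from this thread.]

```sql
-- Hypothetical sketch (assumed query shape; names from the quoted plan).
EXPLAIN ANALYZE
SELECT "binaryID", count(*)
FROM parts_2576
WHERE "binaryID" > '1082fa89fe499741b8271f9c92136f44'
GROUP BY "binaryID"
ORDER BY "binaryID"
LIMIT 400;   -- small limit: the fast-start index-scan plan wins,
             -- because both the index scan and the GroupAggregate
             -- stop as soon as 400 groups have been emitted

-- With LIMIT 4000 the same plan must produce ten times as many groups
-- before stopping, so the start-up advantage shrinks; with no LIMIT at
-- all the planner switches to a lowest-total-cost plan instead.
```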
Thanks
Thom