From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | kawamichi(at)tkl(dot)iis(dot)u-tokyo(dot)ac(dot)jp, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Erroneous cost estimation for nested loop join |
Date: | 2015-11-12 19:02:48 |
Message-ID: | CAMkU=1xP3k+BfdmWOQbsGX0Pf2W_nu_7gfOgOThqt2j8=StNqA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 9, 2015 at 2:42 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 9 November 2015 at 10:08, <kawamichi(at)tkl(dot)iis(dot)u-tokyo(dot)ac(dot)jp> wrote:
>>
>>
>> We guessed the cause of this error would be in the cost model of Postgres,
>> and investigated the source code of optimizer, and we found the cause of
>> this problem. It was in the index cost estimation process. On scanning inner
>> table, if loop count is greater than 1, its I/O cost is counted as random
>> access. In the case of Query2, in one loop (i.e. one inner table scan) ,
>> much data is read sequentially with clustered index, so it seems to be wrong
>> that optimizer thinks its I/O workload is random access.
>
>
> We don't have a clustered index in Postgres. We do store correlation stats
> for the index, which is used in some places to reduce cost.
>
> Could you look some more at this and see if a model change that uses the
> correlation can improve this?
That is already happening. min_IO_cost is set on the assumption of
perfect correlation, max_IO_cost is set on the assumption of no
correlation, and then later in the code it interpolates between these
two based on the observed correlation:
run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
But the min_IO_cost is very pessimistic in this particular case. So I
think that their proposed change is already exactly what you are
asking them to try.
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2015-11-12 19:11:08 | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
Previous Message | Thomas Munro | 2015-11-12 18:25:42 | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |