Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, "Naik, Sameer" <Sameer_Naik(at)bmc(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans
Date: 2019-04-29 16:51:36
Message-ID: 28209.1556556696@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2019-04-29 10:35:39 -0400, Tom Lane wrote:
>> This isn't the first time we've seen a plan-choice failure of this sort.
>> I've wondered if we should make the plancache simply disbelieve generic
>> cost estimates that are actually cheaper than the custom plans, on the
>> grounds that they must be estimation errors. In principle a generic
>> plan could never really be better than a custom plan; so if it looks
>> that way on a cost basis, what that probably means is that the actual
>> parameter values are outliers of some sort (e.g. extremely common),
>> and the custom plan "knows" that it's going to be taking a hit from
>> that, but the generic plan doesn't. In this sort of situation, going
>> with the generic plan could be really disastrous, which is exactly
>> what the OP is seeing (and what we've seen reported before).
>>
>> However, I'm not sure how to tune this idea so that it doesn't end up
>> rejecting perfectly good generic plans. It's likely that there will be
>> some variation in the cost estimates between the generic and specific
>> cases, even if the plan structure is exactly the same; and that
>> variation could go in either direction.

> Yea, I've both seen the "generic is cheaper due to averaged selectivity"
> and the "insignificant cost variations lead to always prefer custom
> plan" problems in production.

I wonder if we couldn't do something based on having seen several
different custom plans before we try to make this decision. It'd be
just about free to track the min and max custom cost estimates, along
with their average. The case where it is sensible to be switching to
a generic plan is where all the plans come out looking more or less
alike --- if the workload is such that we get markedly different plans
for different inputs, then we'd probably better just eat the cost of
planning every time. So maybe the rule should be something like
"if the min and max custom costs, as well as the generic cost
estimate, are all within 10% of the average custom cost, then it's
okay to switch to generic". We might need to collect more than 5
custom estimates before we put much faith in the decision, too.

> I've also - but less severely - seen that the "planning cost" we add to
> the custom plan leads to the generic plan to always be preferred.

Yeah; the planning cost business is very much of a hack, because we
don't have a good handle on how that really relates to execution
costs. But if we're thinking of the decision as being risk-based,
which is basically what I'm suggesting above, maybe we could just
drop that whole component of the algorithm?

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Naik, Sameer 2019-04-30 04:58:50 RE: Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans
Previous Message Andres Freund 2019-04-29 16:23:38 Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans