From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | "Naik, Sameer" <Sameer_Naik(at)bmc(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans |
Date: | 2019-04-29 14:35:39 |
Message-ID: | 22414.1556548539@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Mon, Apr 29, 2019 at 10:36:20AM +0000, Naik, Sameer wrote:
>> Since Postgres 9.2, for prepared statements, the CBO automatically switches
>> from Custom Plan to Generic plan on the sixth iteration (reference backend/
>> utils/cache/plancache.c).
> This is not totally true.
Yeah, that's a pretty inaccurate statement of the behavior.
The problem seems to be that the actual values being used for
c400129200 and c400127400 are quite common in the dataset,
so that when considering
Filter: ... (c400129200 = '0'::citext) AND (c400127400 = 'DATASET1M'::citext)
the planner makes a roughly correct assessment that there are a lot of
such rows, so it prefers to index on the basis of the giant OR clause
instead, even though that's fairly expensive. But, when considering
the generic case
-> Index Scan using i776_0_400129200_t776 on t776 (cost=0.42..12.66 rows=1 width=52) (actual time=1190.399..5544.385 rows=48 loops=1)
Index Cond: ((c400129200 = $1) AND (c400127400 = $2))
it's evidently guessing that just a few rows will match the index
condition (no more than about 3 given the cost number), making this plan
look much cheaper, so it goes with this plan. I wonder what the actual
distribution of those keys is.
In v10 and later, it's quite possible that creating extended stats
on the combination of those two columns would produce a better
estimate. Won't help OP on 9.6, though.
This isn't the first time we've seen a plan-choice failure of this sort.
I've wondered if we should make the plancache simply disbelieve generic
cost estimates that are actually cheaper than the custom plans, on the
grounds that they must be estimation errors. In principle a generic
plan could never really be better than a custom plan; so if it looks
that way on a cost basis, what that probably means is that the actual
parameter values are outliers of some sort (e.g. extremely common),
and the custom plan "knows" that it's going to be taking a hit from
that, but the generic plan doesn't. In this sort of situation, going
with the generic plan could be really disastrous, which is exactly
what the OP is seeing (and what we've seen reported before).
However, I'm not sure how to tune this idea so that it doesn't end up
rejecting perfectly good generic plans. It's likely that there will be
some variation in the cost estimates between the generic and specific
cases, even if the plan structure is exactly the same; and that
variation could go in either direction.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2019-04-29 16:23:38 | Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans |
Previous Message | Bruce Momjian | 2019-04-29 13:36:07 | Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans |