Re: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jan Kort <jan(dot)kort(at)genetics(dot)nl>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE
Date: 2021-05-19 07:51:58
Message-ID: CAApHDvpVRYPHrDF4QmXemfV6FYEtOp7eRMcO1LXYUVVdYc-_bg@mail.gmail.com
Lists: pgsql-bugs

On Wed, 19 May 2021 at 14:13, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> That makes me think the best fix would be to do something better
> during ANALYZE and maybe try and include some more upper bound MCVs.
> I'm not yet too sure what drawbacks there might be from doing that.

I had a quick look at the ANALYZE code and saw that we only consider
values for the MCV list when they appear more than once in the analyzed
sample. As for the histogram, we only build one if there are at least
2 distinct values that are not covered by the MCV list. The comment
mentions:

/*
 * Generate a histogram slot entry if there are at least two distinct
 * values not accounted for in the MCV list. (This ensures the
 * histogram won't collapse to empty or a singleton.)
 */

So given we only have 2 distinct values in the "three" table and one
of those is tracked in the MCV list, there are not enough values
remaining to build a histogram.

It seems you've hit about the worst case here. If you'd had one more
distinct value in the gfo_zaken_kosten table, that would have been
enough to build a histogram. Or if the 98 value had not been
duplicated, we'd not have built an MCV list and would have built a
histogram with the two values instead.
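
To make the interaction of those two rules concrete, here is a toy
model in Python. It is only a sketch of the two conditions described
above, not the real ANALYZE code, which also applies frequency
thresholds and statistics-target limits. The function name
sketch_stats and the sample values 99 and 100 are made up for
illustration.

```python
from collections import Counter


def sketch_stats(sample):
    """Toy model of the two rules described above; NOT the real ANALYZE logic."""
    counts = Counter(sample)
    # Rule 1: only values seen more than once in the sample are MCV candidates.
    mcv = sorted(v for v, n in counts.items() if n > 1)
    # Rule 2: a histogram is only built when at least two distinct values
    # are left over after the MCV list has taken its share.
    rest = sorted(v for v, n in counts.items() if n <= 1)
    histogram = rest if len(rest) >= 2 else None
    return mcv, histogram


# Hypothetical samples: 98 is the duplicated value, 99 and 100 are invented.
print(sketch_stats([98, 98, 99]))       # duplicated 98 plus one other value
print(sketch_stats([98, 99]))           # 98 not duplicated
print(sketch_stats([98, 98, 99, 100]))  # one more distinct value
```

In this model the first sample yields an MCV list of [98] and no
histogram, the second yields no MCVs and a two-value histogram, and the
third yields both an MCV list and a histogram.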

If you want a workaround, you could do:

alter table trial.gfo_zaken_kosten alter column gfo_zaken_id set statistics 0;
delete from pg_statistic where
starelid='trial.gfo_zaken_kosten'::regclass and staattnum=2;
analyze trial.gfo_zaken_kosten;

That's a bit dirty, though. If that table were ever to change, you'd
need to do:

alter table trial.gfo_zaken_kosten alter column gfo_zaken_id set statistics -1;
analyze trial.gfo_zaken_kosten;

otherwise you might get bad plans due to lack of stats.

David
