Re: Less selective index chosen unexpectedly

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, James Coleman <jtc331(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Less selective index chosen unexpectedly
Date: 2021-05-18 22:25:33
Message-ID: CAH2-WznC9xbNJE3ghpZb1u-sRqBN=Y8AFktQbwEOrbsm70z6qw@mail.gmail.com
Lists: pgsql-bugs

On Tue, May 18, 2021 at 2:50 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I think the real winner would be a mechanism to incrementally analyze
> tables, so that it updates the existing stats by sampling only blocks
> that have new data, and "somehow" merge that into the existing
> statistics. You could have such a process run much more frequently than
> standard analyze, because the cost is [supposed to be] smaller.
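
To make the merging idea a little more concrete, I imagine something
along these lines -- just a rough Python sketch, where the ColumnStats
fields and the simple row-count weighting are my own illustration and
don't correspond to pg_statistic internals:

from dataclasses import dataclass

@dataclass
class ColumnStats:
    rows: int          # rows the estimate is based on
    null_frac: float   # fraction of NULL values
    avg_width: float   # average value width in bytes

def merge_stats(existing: ColumnStats, new_sample: ColumnStats) -> ColumnStats:
    # Weight each side by the number of rows it represents, so a small
    # incremental sample nudges the existing statistics rather than
    # replacing them outright.
    total = existing.rows + new_sample.rows
    w_old = existing.rows / total
    w_new = new_sample.rows / total
    return ColumnStats(
        rows=total,
        null_frac=w_old * existing.null_frac + w_new * new_sample.null_frac,
        avg_width=w_old * existing.avg_width + w_new * new_sample.avg_width,
    )

# A full-ANALYZE estimate plus a small sample of recently dirtied blocks:
merged = merge_stats(ColumnStats(30000, 0.10, 8.0), ColumnStats(500, 0.40, 8.0))
print(merged.null_frac)  # ~0.105 -- pulled slightly toward the newer sample

Of course, things like ndistinct and the MCV list can't be merged with a
simple weighted average, which is presumably where the "somehow" comes in.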

I wonder if there is a more general design that incorporates changes
over time. That is, a design that has ANALYZE store old statistics for
a table in order to give the system (and the DBA) a sense of how
things change over time. This could enable autoanalyze behavior that
more or less settles on an optimal interval between ANALYZE
operations -- frequent enough to keep the statistics stable within
some margin of error, but not too frequent.
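
As a very rough illustration of what "settling on a frequency" might
mean -- hypothetical Python, with made-up stat names and thresholds,
nothing like how autoanalyze is actually driven today:

def drift(prev: dict, curr: dict) -> float:
    """Largest relative change across tracked statistics between two
    consecutive ANALYZE snapshots (e.g. ndistinct, null_frac)."""
    return max(abs(curr[k] - prev[k]) / max(abs(prev[k]), 1e-9) for k in prev)

def next_interval(history: list[dict], interval_s: float,
                  target: float = 0.05) -> float:
    # Narrow the interval when the stats moved a lot between runs,
    # widen it when they barely moved, otherwise leave it alone.
    if len(history) < 2:
        return interval_s
    d = drift(history[-2], history[-1])
    if d > 2 * target:
        return interval_s / 2
    if d < target / 2:
        return interval_s * 2
    return interval_s

history = [{"ndistinct": 1000.0, "null_frac": 0.10},
           {"ndistinct": 1800.0, "null_frac": 0.12}]
print(next_interval(history, 3600.0))  # 1800.0 -- big drift, analyze sooner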

I also wonder if this general approach could enable a strategy that
uses something like Bayesian inference to detect bad statistics,
and/or to dampen the consequences of bad estimates.
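
For example (purely illustrative numbers and function names, not a
concrete proposal): treat a clause's selectivity as Beta-distributed,
seed the prior from the planner's estimate, and update it from rows
actually seen at execution time. A posterior that lands far from the
prior is a signal that the underlying statistics are bad.

def posterior_selectivity(est_sel: float, prior_weight: float,
                          observed_hits: int, observed_rows: int) -> float:
    # Beta(a, b) prior centred on the planner's estimate, with
    # prior_weight acting as the prior's "pseudo row count".
    a = est_sel * prior_weight
    b = (1.0 - est_sel) * prior_weight
    # Binomial likelihood: observed_hits matching rows out of observed_rows.
    a += observed_hits
    b += observed_rows - observed_hits
    return a / (a + b)  # posterior mean

est = 0.001   # planner thought 0.1% of rows would match
post = posterior_selectivity(est, prior_weight=100,
                             observed_hits=5000, observed_rows=10000)
print(post)   # ~0.5 -- execution says ~50%, so the estimate looks badly off

The same posterior could then be blended back into the estimate to dampen
the damage until the next ANALYZE, rather than trusting either side outright.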

--
Peter Geoghegan
