Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject: Re: Statistics Import and Export
Date: 2025-03-06 17:16:44
Message-ID: CADkLM=dTTf9tonKsQnDCDp5oyODE2mFV7K6nFuCp84QE3GBWuQ@mail.gmail.com
Lists: pgsql-hackers

>
> To be honest, I am a bit surprised that we decided to enable this by
> default. It's not obvious to me that statistics should be regarded as
> part of the database in the same way that table definitions or table
> data are. That said, I'm not overwhelmingly opposed to that choice.
> However, even if it's the right choice in theory, we should maybe
> rethink if it's going to be too slow or use too much memory.
>

I'm strongly in favor of the choice to make it default. This is reducing
the impact of a post-upgrade customer footgun wherein heavy workloads are
applied to a database post-upgrade but before analyze/vacuumdb have had a
chance to do their magic [1].

It seems to me that we're fretting over seconds when the feature is
potentially saving the customer hours of reduced availability if not
outright downtime.

[1] In that situation, the workload queries have no stats and get terrible
plans, and everything becomes a sequential scan. Sequential scans swamp the
system, starving the analyze commands of the I/O they need to get the badly
needed statistics. Even after the stats are in place, the system is still
swamped with queries that were in flight before the stats were in place.
Even well-intentioned customers [2] can fall prey to this when their
microservices detect that the database is online again, and automatically
resume work.

[2] This exact situation happened at a place where I was consulting. The
microservices all restarted work automatically despite assurances that they
would not. That bad experience was my primary motivator for implementing
this feature.
