Re: Statistics Import and Export

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject: Re: Statistics Import and Export
Date: 2024-10-31 13:52:12
Message-ID: CADkLM=eSnbrOPfof8JeSZriFXnqpYcyRw8-WhL5nesbkduDdUQ@mail.gmail.com
Lists: pgsql-hackers

>
>
> (c) we are considering whether to use an in-place heap update for the
> relation stats, so that a large restore doesn't bloat pg_class -- I'd
> like feedback on this idea
>

I'd also like feedback, though I feel very strongly that we should do what
ANALYZE does. In an upgrade situation, nearly all tables will have stats
imported, which would result in an immediate doubling of pg_class - not the
end of the world, but not great either.
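
As a rough illustration of where the doubling comes from, here is a toy model, not PostgreSQL code. It assumes a regular update leaves the old row version behind as a dead tuple until vacuum reclaims it, while an in-place update overwrites the existing version. Every name in it (make_heap, mvcc_update, inplace_update, restore_all) is hypothetical.

```python
# Toy model only: a "heap" is a list of row versions, each live or dead.
# None of these names correspond to real PostgreSQL functions.

def make_heap(n):
    """Build a catalog with one live row version per relation."""
    return [{"rel": f"t{i}", "stats": None, "live": True} for i in range(n)]

def mvcc_update(heap, rel, stats):
    """Regular update: mark the old version dead and append a new one."""
    for version in heap:
        if version["rel"] == rel and version["live"]:
            version["live"] = False
            heap.append({"rel": rel, "stats": stats, "live": True})
            return
    raise KeyError(rel)

def inplace_update(heap, rel, stats):
    """In-place update: overwrite the stats in the existing live version."""
    for version in heap:
        if version["rel"] == rel and version["live"]:
            version["stats"] = stats
            return
    raise KeyError(rel)

def restore_all(heap, update):
    """Import stats for every relation and return the resulting heap size."""
    # Snapshot the names first, because mvcc_update appends to the heap.
    for rel in [v["rel"] for v in heap if v["live"]]:
        update(heap, rel, {"relpages": 10, "reltuples": 1000.0})
    return len(heap)

n = 1000
mvcc_size = restore_all(make_heap(n), mvcc_update)
inplace_size = restore_all(make_heap(n), inplace_update)
print(mvcc_size, inplace_size)
```

In this model the regular-update path ends with twice as many row versions as relations, half of them dead, while the in-place path keeps the original count.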

Given the recent bugs associated with inplace updates and race conditions,
if we don't want to do in-place here, we should also consider getting rid
of it for ANALYZE. I briefly pondered if it would make sense to vertically
partition pg_class into the stable attributes and the attributes that get
modified in-place, but that list is pretty long: relpages, reltuples,
relallvisible, relhasindex, reltoastrelid, relhasrules, relhastriggers,
relfrozenxid, and relminmxid.

If we don't want to do in-place updates in pg_restore_relation_stats(), then
we could mitigate the bloat with a VACUUM FULL pg_class at the tail end of
the upgrade if stats were enabled.

> pg_restore_*_stats() functions. But there's a lot of overlap, so it may
> be worth discussing again whether we should only have one set of
> functions.
>

Because of in-place updates and error tolerance, I think they have
to remain separate functions, but I'm also interested in hearing others'
opinions.
