Re: Statistics Import and Export

From: Andres Freund <andres(at)anarazel(dot)de>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject: Re: Statistics Import and Export
Date: 2025-03-06 17:16:28
Message-ID: vhlnlrhgqaqxcjioyeyjoxuqzaccvsrbgeuqclalzqg5sp4u2k@42sqfxto23gg
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-03-06 12:04:25 -0500, Corey Huinker wrote:
> > > If there's value in freeing them, why isn't it being done already? What
> > > other thing would consume this freed memory?
> >
> > I'm not saying that they can be freed, they can't right now. My point is
> > just
> > that we *already* keep all the stats in memory, so the fact that fetching
> > all
> > stats in a single query would also require keeping them in memory is not an
> > issue.
> >
>
> That's true in cases where we're not filtering schemas or tables. We fetch
> the pg_class stats as a part of getTables, but those are small, and not a
> part of the query in question.
>
> Fetching all the pg_stats for a db when we only want one table could be a
> nasty performance regression

I don't think anybody argued that we should fetch all stats regardless of
filtering for the to-be-dumped tables.

> and we can't just filter on the oids of the tables we want, because those
> tables can have expression indexes, so the oid filter would get complicated
> quickly.

I don't follow. We already have the tablenames, schemanames and oids of the
to-be-dumped tables/indexes collected in pg_dump, all that's needed is to send
a list of those to the server to filter there?

> > But TBH, I do wonder how much the current memory usage of the statistics
> > dump/restore support is going to bite us. In some cases this will
> > dramatically
> > increase pg_dump/pg_upgrade's memory usage, my tests were with tiny
> > amounts of
> > data and very simple scalar datatypes and you already could see a
> > substantial
> > increase. With something like postgis or even just a lot of jsonb columns
> > this is going to be way worse.
> >
>
> Yes, it will cost us in pg_dump, but it will save customers from some long
> ANALYZE operations.

My concern is that it might prevent some upgrades from *ever* completing,
because of pg_dump running out of memory.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2025-03-06 17:16:44 Re: Statistics Import and Export
Previous Message Andres Freund 2025-03-06 17:13:21 Re: what's going on with lapwing?