Re: Statistics Import and Export

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org
Subject: Re: Statistics Import and Export
Date: 2025-03-06 19:34:41
Message-ID: pmeiyyov44nipdjpaqkow44xt6skckiqg3rnuih3olgnt2rjbd@6esg7dyw2mvp
Lists: pgsql-hackers

Hi,

On 2025-03-06 13:47:34 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > And in contrast to analyzing the database in parallel, the pg_dump/restore
> > work to restore stats afaict happens single-threaded for each database.
>
> In principle we should be able to do stats dump/restore parallelized
> just as we do for data.

Yea.

Whether the gains are worth the cost isn't clear to me though. Issuing
individual queries for each relation needs a fair bit of parallelism to catch
up to doing the dumping in a single statement, if it ever can.
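
To make that trade-off concrete, here is a rough back-of-the-envelope sketch. The cost model, the function name and the example inputs are all assumptions for illustration, not measurements of pg_dump: it simply asks how many workers, each issuing one query per relation, would be needed to match the elapsed time of a single bulk statement, assuming work divides evenly and coordination is free.

```python
import math


def break_even_workers(n_relations, per_query_cost, single_statement_cost):
    """Smallest worker count at which per-relation queries match one bulk query.

    Hypothetical model: total per-relation work is n_relations * per_query_cost,
    split evenly across workers with no coordination overhead (which flatters
    the parallel case). Costs are in any consistent time unit.
    """
    if single_statement_cost <= 0:
        raise ValueError("single_statement_cost must be positive")
    total_serial = n_relations * per_query_cost
    # Need total_serial / workers <= single_statement_cost.
    return max(1, math.ceil(total_serial / single_statement_cost))


if __name__ == "__main__":
    # Illustrative inputs only: 1000 relations, 2 time units per query,
    # 500 time units for the single-statement dump.
    print(break_even_workers(1000, 2, 500))
```

If the per-relation overhead (round trips, planning, catalog lookups) is large relative to the bulk query, the required worker count quickly exceeds what -j would realistically be set to, which is the "if it ever can" caveat.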

> 1. pg_upgrade has made a policy judgement to apply parallelism across
> databases not within a database, ie it will launch concurrent dump/
> restore tasks in different DBs but not authorize any one of them to
> eat multiple CPUs. That needs to be re-thought probably, as I think
> that decision dates to before we had useful parallelism in pg_dump and
> pg_restore. I wonder if we could just rip out pg_upgrade's support
> for DB-level parallelism, which is not terribly pretty anyway, and
> simply pass the -j switch straight to pg_dump and pg_restore.

I don't think that'd work well. Right now pg_dump only handles a single
database (pg_dumpall doesn't yet support -Fc) *and* pg_dump is still serial
for the bulk of the work that pg_upgrade cares about.

I think the only parallelism that'd actually happen for pg_upgrade would be
dumping of large objects?

Greetings,

Andres Freund
