From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, alvherre(at)alvh(dot)no-ip(dot)org |
Subject: | Re: Statistics Import and Export |
Date: | 2025-03-06 19:47:08 |
Message-ID: | 721139.1741290428@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
> On Thu, Mar 06, 2025 at 01:47:34PM -0500, Tom Lane wrote:
>> ... I wonder if we could just rip out pg_upgrade's support
>> for DB-level parallelism, which is not terribly pretty anyway, and
>> simply pass the -j switch straight to pg_dump and pg_restore.
> That would certainly help for clusters with one big database with many LOs
> or something, but I worry it would hurt the many database case quite a bit.
I'm very skeptical of that. How many DBs do you know with just one table?
I think most have enough that they could keep a reasonable number of
CPUs busy with pg_dump's internal parallelism.
> Maybe we could add a --jobs-per-db option that indicates how to parallelize
> dump/restore. If you set --jobs=8 --jobs-per-db=8, the databases would be
> dumped serially, but pg_dump would get -j8. If you set --jobs=8 and
> --jobs-per-db=2, we'd process 4 databases at a time, each with -j2.
I specifically didn't propose such a thing because I think it will be
a sucky user experience. In the first place, users are unlikely to
take the time to puzzle out exactly how they should slice that up;
in the second place, if they try they won't necessarily find that
there's a good solution with those knobs; in the third place,
pg_upgrade is commonly invoked through packager-supplied scripts that
might not give access to those switches anyway.
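For illustration, the arithmetic behind the quoted --jobs-per-db proposal can be sketched as below. This is only a sketch: --jobs-per-db is a proposed switch, not an existing pg_upgrade option, the function name `split_jobs` is hypothetical, and the integer-division handling of values that do not divide evenly is an assumption the thread does not specify.

```python
def split_jobs(jobs: int, jobs_per_db: int) -> tuple[int, int]:
    """Return (databases processed at once, -j value handed to pg_dump/pg_restore).

    Hypothetical helper: models the proposed --jobs / --jobs-per-db split.
    """
    if jobs < 1 or jobs_per_db < 1:
        raise ValueError("jobs and jobs_per_db must be positive")
    # Assumption: a per-database setting larger than the total budget is capped.
    per_db = min(jobs_per_db, jobs)
    # Assumption: leftover jobs that do not fill a whole database slot stay idle.
    concurrent_dbs = jobs // per_db
    return concurrent_dbs, per_db


if __name__ == "__main__":
    for jobs, jobs_per_db in [(8, 8), (8, 2), (8, 3)]:
        dbs, j = split_jobs(jobs, jobs_per_db)
        idle = jobs - dbs * j
        print(f"--jobs={jobs} --jobs-per-db={jobs_per_db}: "
              f"{dbs} database(s) at a time, each with -j{j}, {idle} job(s) idle")
```

The first two cases reproduce the numbers in the quoted proposal (one database with -j8, four databases with -j2). The third case, under the rounding assumption above, leaves two of the eight jobs unused, which is one way the two knobs can fail to offer a good split.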
In the short term I think repurposing -j as meaning within-DB
parallelism rather than cross-DB parallelism would be a win for the
vast majority of users. We could imagine some future feature that
lets pg_upgrade try to slice up the available jobs on its own
(say, based on a preliminary survey of how many tables in each DB).
But I don't want to build that today, and maybe we won't ever.
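As a purely hypothetical illustration of the "future feature" idea above, the sketch below divides a job budget across databases using per-database table counts. Nothing like this exists in pg_upgrade; the function name `slice_jobs`, the greedy rule, and the requirement of at least one job per database are all assumptions made for the example.

```python
def slice_jobs(table_counts: dict[str, int], jobs: int) -> dict[str, int]:
    """Assign each database a -j value from a shared budget of `jobs`.

    Hypothetical sketch: every database starts with one job, and each spare job
    goes to the database that currently has the most tables per assigned job.
    """
    if jobs < len(table_counts):
        raise ValueError("this sketch assumes at least one job per database")
    alloc = {db: 1 for db in table_counts}
    for _ in range(jobs - len(alloc)):
        # A database cannot use more workers than it has tables.
        candidates = [db for db in alloc if alloc[db] < table_counts[db]]
        if not candidates:
            break
        busiest = max(candidates, key=lambda db: table_counts[db] / alloc[db])
        alloc[busiest] += 1
    return alloc


if __name__ == "__main__":
    survey = {"big": 900, "small": 100, "tiny": 1}  # made-up table counts
    print(slice_jobs(survey, jobs=8))
```

With the made-up survey above and eight jobs, the greedy rule gives the spare jobs to the largest database. Table count is only a rough proxy for dump and restore cost, so a real implementation would likely need a better weighting.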
regards, tom lane