Re: optimize file transfer in pg_upgrade

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, bruce(at)momjian(dot)us
Subject: Re: optimize file transfer in pg_upgrade
Date: 2025-03-17 19:34:34
Message-ID: Z9h5Spp76EBygyEL@nathan
Lists: pgsql-hackers

On Wed, Mar 05, 2025 at 08:34:37PM -0600, Nathan Bossart wrote:
> Thank you, Greg and Robert, for sharing your thoughts. With that, here's
> what I'm considering to be a reasonably complete patch set for this
> feature. This leaves about a month for rigorous testing and editing, so
> I'm hopeful it'll be ready for v18.

Here are my notes after a round of self-review.

0001:
* The documentation does not adequately describe the interaction between
--no-sync-data-files and --sync-method=syncfs.
* I really don't like the exclude_dir hack for skipping the main tablespace
directory, but I haven't thought of anything that seems better.
* I should verify that there are no path separator issues on Windows for the
exclude_dir hack. From some quick code analysis, I think it should work
fine, but I probably ought to test it out to be sure.
* The documentation needs to mention that the tablespace directories
themselves are not synchronized.

0002:
* The documentation changes are subject to update based on ongoing stats
import/export work.
* Does --statistics-only --sequence-data make any sense? It seems like it
ought to function as expected, but it's hard to see a use-case.

0003:
* Once committed, I should update one of my buildfarm animals to use
PG_TEST_PG_UPGRADE_MODE=--swap.
* For check_hard_link() and disable_old_cluster(), move the Assert() to an
"else" block with a pg_fatal() call for sturdiness.
* I need to do a thorough pass-through on all comments. Many are not
sufficiently detailed.
* The "." and ".." checks in the catalog swap code are redundant and can be
removed.
* The directory for "moved-aside" stuff should be placed within the old
cluster's corresponding tablespace directory so that no changes need to
be made to delete_old_cluster.{sh,bat}.
* Manual testing with non-default tablespaces!
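
For illustration only, the Assert()-to-"else" change mentioned above for
check_hard_link() and disable_old_cluster() might look roughly like the sketch
below. It is not taken from the patch: the enum, sketch_fatal() (a stand-in for
pg_fatal()), and sketch_mode_label() are hypothetical names. The point is that
an Assert() compiles away in non-assert builds, while an explicit "else" branch
fails loudly in every build.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for pg_upgrade's transfer modes; not from the patch. */
typedef enum
{
	SKETCH_MODE_COPY,
	SKETCH_MODE_LINK,
	SKETCH_MODE_SWAP
} SketchTransferMode;

/* Hypothetical stand-in for pg_fatal(): report the problem and exit. */
static void
sketch_fatal(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(1);
}

/*
 * Only link and swap modes are expected here.  Rather than asserting that
 * the final case must be swap mode, each expected mode is tested explicitly
 * and anything else lands in the "else" block.
 */
static const char *
sketch_mode_label(SketchTransferMode mode)
{
	if (mode == SKETCH_MODE_LINK)
		return "link";
	else if (mode == SKETCH_MODE_SWAP)
		return "swap";
	else
		sketch_fatal("unrecognized transfer mode");

	return NULL;				/* not reached */
}
```

With this shape, passing an unexpected mode (SKETCH_MODE_COPY in this sketch)
terminates with an error message instead of silently falling through when
assertions are disabled.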

Updated patches based on these notes are attached.

--
nathan

Attachment                                                   Content-Type   Size
v5-0001-initdb-Add-no-sync-data-files.patch                  text/plain     13.3 KB
v5-0002-pg_dump-Add-sequence-data.patch                      text/plain     4.9 KB
v5-0003-pg_upgrade-Add-swap-for-faster-file-transfer.patch   text/plain     30.0 KB
