Re: optimize file transfer in pg_upgrade

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: optimize file transfer in pg_upgrade
Date: 2024-11-19 18:20:00
Message-ID: ZzzW0AGjUG8xVhux@nathan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 17, 2024 at 01:50:53PM -0500, Greg Sabino Mullane wrote:
> On Wed, Nov 6, 2024 at 5:07 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
> wrote:
>> Therefore, it can be much faster to instead move the entire data directory
>> from the old cluster
>> to the new cluster and to then swap the catalog relation files.
>
> Thank you for breaking this up so clearly into separate commits. I think it
> is a very interesting idea, and anything to speed up pg_upgrade is always
> welcome. Some minor thoughts:

Thank you for reviewing!

>> .. we don't really expect there to be directories within database
> directories,
>> so perhaps it would be better to either unconditionally rename or to fail.
> Failure seems the best option here, so we can cleanly handle any future
> cases in which we decide to put dirs in this directory.

Good point.

>> if (RelFileNumberIsValid(rfn))
>> {
>> FileNameMap key;
>>
>> key.relfilenumber = (RelFileNumber) rfn;
>> if (bsearch(&key, context->maps, context->size,
>> sizeof(FileNameMap), FileNameMapCmp))
>> return 0;
>> }
>>
>> snprintf(dst, sizeof(dst), "%s/%s", context->target, filename);
>> if (rename(fname, dst) != 0)
>
> I'm not quite clear what we are doing here with falling through
> for InvalidOid entries, could you explain?

The idea is that if it looks like a data file that we might want to
transfer (i.e., it starts with a RelFileNumber), we should consult our map
to determine whether to move it. Otherwise, we want to unconditionally
transfer it so that we always use the files generated during pg_restore in
the new cluster (e.g., PG_VERSION and pg_filenode.map). In theory, this
should result in the same end state as what --link mode does today (for the
new cluster, at least).

>> .. vm_must_add_frozenbit isn't handled yet. We could either disallow
>> using catalog-swap mode if the upgrade involves versions older than v9.6
>
> Yes, this. No need for more code to handle super old versions when other
> options exist.

I'm inclined to agree.

>> with this problem is to introduce a special mode for "initdb --sync-only"
>> that calls fsync() for everything _except_ the actual data files. If we
>> fsync() the new catalog files as we move them into place, and if we assume
>> that the old catalog files will have been properly synchronized before
>> upgrading, there's no reason to synchronize them again at the end.
>
> Very cool approach!

:)

--
nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-11-19 18:29:06 Re: optimize file transfer in pg_upgrade
Previous Message Guillaume Lelarge 2024-11-19 18:13:58 Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE