Re: optimizing pg_upgrade's once-in-each-database steps

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: optimizing pg_upgrade's once-in-each-database steps
Date: 2024-07-01 19:46:56
Message-ID: ZoMHsBasBy8LqkEU@nathan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I figured I'd post what I have so far since this thread hasn't been updated
in a while. The attached patches are still "proof-of-concept grade," but
they are at least moving in the right direction (IMHO). The variable
naming is still not great, and they are woefully undercommented, among
other things.

0001 introduces a new API for registering callbacks and running them in
parallel on all databases in the cluster. This new system manages a set of
"slots" that follow a simple state machine to asynchronously establish a
connection and run the queries. It uses system() to wait for these
asynchronous tasks to complete. Users of this API only need to provide two
callbacks: one to return the query that should be run on each database and
another to process the results of that query. If multiple queries are
required for each database, users can provide multiple sets of callbacks.

The other patches change several of the existing tasks to use this new API.
With these patches applied, I see the following differences in the output
of 'pg_upgrade | ts -i' for a cluster with 1k empty databases:

WITHOUT PATCH

00:00:19 Checking database user is the install user ok
00:00:02 Checking for subscription state ok
00:00:06 Adding ".old" suffix to old global/pg_control ok
00:00:04 Checking for extension updates ok

WITH PATCHES (--jobs 1)

00:00:10 Checking database user is the install user ok
00:00:02 Checking for subscription state ok
00:00:07 Adding ".old" suffix to old global/pg_control ok
00:00:05 Checking for extension updates ok

WITH PATCHES (--jobs 4)

00:00:06 Checking database user is the install user ok
00:00:00 Checking for subscription state ok
00:00:02 Adding ".old" suffix to old global/pg_control ok
00:00:01 Checking for extension updates ok

Note that the "Checking database user is the install user" time also
includes the call to get_db_rel_and_slot_infos() on the old cluster as well
as the call to get_loadable_libraries() on the old cluster. I believe the
improvement with the patches with just one job is due to the consolidation
of the queries into one database connection (presently,
get_db_rel_and_slot_infos() creates 3 connections per database for some
upgrades). Similarly, the "Adding \".old\" suffix to old
global/pg_control" time includes the call to get_db_rel_and_slot_infos() on
the new cluster.

There are several remaining places where we could use this new API to speed
up upgrades. For example, I haven't attempted to use it for the data type
checks yet, and that tends to eat up a sizable chunk of time when there are
many databases.

On Thu, May 16, 2024 at 08:24:08PM -0500, Nathan Bossart wrote:
> On Thu, May 16, 2024 at 05:09:55PM -0700, Jeff Davis wrote:
>> Also, did you consider connecting once to each database and running
>> many queries? Most of those seem like just checks.
>
> This was the idea behind 347758b. It may be possible to do more along
> these lines. IMO parallelizing will still be useful even if we do combine
> more of the steps.

My current thinking is that any possible further consolidation should
happen as part of a follow-up effort to parallelization. I'm cautiously
optimistic that the parallelization work will make the consolidation easier
since it moves things to rigidly-defined callback functions.

A separate piece of off-list feedback from Michael Paquier is that this new
parallel system might be something we can teach the ParallelSlot code used
by bin/scripts/ to do. I've yet to look too deeply into this, but I
suspect that it will be difficult to combine the two. For example, the
ParallelSlot system doesn't seem well-suited for the kind of
run-once-in-each-database tasks required by pg_upgrade, and the error
handling is probably little different, too. However, it's still worth a
closer look, and I'm interested in folks' opinions on the subject.

--
nathan

Attachment Content-Type Size
v2-0001-introduce-framework-for-parallelizing-pg_upgrade-.patch text/plain 10.0 KB
v2-0002-use-new-pg_upgrade-async-API-for-subscription-sta.patch text/plain 8.8 KB
v2-0003-move-live_check-variable-to-user_opts.patch text/plain 11.8 KB
v2-0004-use-new-pg_upgrade-async-API-for-retrieving-relin.patch text/plain 11.1 KB
v2-0005-use-new-pg_upgrade-async-API-to-parallelize-getti.patch text/plain 3.4 KB
v2-0006-use-new-pg_upgrade-async-API-to-parallelize-repor.patch text/plain 3.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2024-07-01 19:47:43 Re: Avoid incomplete copy string (src/backend/access/transam/xlog.c)
Previous Message Jeff Davis 2024-07-01 19:24:15 Re: Built-in CTYPE provider