Re: optimizing pg_upgrade's once-in-each-database steps

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Ilya Gladyshev <ilya(dot)v(dot)gladyshev(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: optimizing pg_upgrade's once-in-each-database steps
Date: 2024-08-01 21:41:18
Message-ID: ZqwA_qngQM25FrjK@nathan
Lists: pgsql-hackers

On Thu, Aug 01, 2024 at 12:44:35PM -0500, Nathan Bossart wrote:
> On Wed, Jul 31, 2024 at 10:55:33PM +0100, Ilya Gladyshev wrote:
>> I like your idea of parallelizing these checks with async libpq API, thanks
>> for working on it. The patch doesn't apply cleanly on master anymore, but
>> I've rebased locally and taken it for a quick spin with a pg16 instance of
>> 1000 empty databases. Didn't see any regressions with -j 1, there's some
>> speedup with -j 8 (33 sec vs 8 sec for these checks).
>
> Thanks for taking a look. I'm hoping to do a round of polishing before
> posting a rebased patch set soon.
>
>> One thing that I noticed that could be improved is we could start a new
>> connection right away after having run all query callbacks for the current
>> connection in process_slot, instead of just returning and establishing the
>> new connection only on the next iteration of the loop in async_task_run
>> after potentially sleeping on select.
>
> Yeah, we could just recursively call process_slot() right after freeing the
> slot. That'd at least allow us to avoid the spinning behavior as we run
> out of databases to process, if nothing else.

Here is a new patch set. Besides rebasing, I've added the recursive call
to process_slot() mentioned in the quoted text, and I've added quite a bit
of commentary to async.c.
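
For readers skimming the thread, the effect of the recursive call can be shown with a small, libpq-free simulation. This is only a sketch and not the code from the attached patches: the Slot and Task types, the state names, and run_task() are all hypothetical, and every slot is assumed to be "ready" on each pass of the loop instead of being waited on with select().

```c
#include <assert.h>

/* Hypothetical slot states; names are illustrative, not from the patch. */
typedef enum { SLOT_FREE, SLOT_CONNECTING, SLOT_RUNNING } SlotState;

typedef struct
{
    SlotState state;
    int       db;       /* index of the database this slot is working on */
    int       query;    /* number of query callbacks already run */
} Slot;

typedef struct
{
    int ndbs;           /* total number of databases to process */
    int next_db;        /* next database to hand out to a free slot */
    int nqueries;       /* query callbacks to run in each database */
    int done_dbs;       /* databases fully processed so far */
} Task;

/* Advance one slot by one step of its (simulated) state machine. */
static void
process_slot(Task *task, Slot *slot)
{
    switch (slot->state)
    {
        case SLOT_FREE:
            if (task->next_db >= task->ndbs)
                return;             /* nothing left to hand out */
            slot->db = task->next_db++;
            slot->query = 0;
            slot->state = SLOT_CONNECTING;
            return;

        case SLOT_CONNECTING:
            /* Pretend the connection finished; start running queries. */
            slot->state = SLOT_RUNNING;
            return;

        case SLOT_RUNNING:
            if (++slot->query < task->nqueries)
                return;             /* more callbacks for this database */
            task->done_dbs++;
            slot->state = SLOT_FREE;

            /*
             * Claim the next database right away instead of waiting for
             * the next pass of the main loop.  The recursion is at most
             * one level deep because the SLOT_FREE case always returns.
             */
            process_slot(task, slot);
            return;
    }
}

/* Drive all slots until every database is done; returns the pass count. */
int
run_task(Task *task, int nslots)
{
    Slot slots[16] = {{SLOT_FREE, 0, 0}};   /* all slots start out free */
    int  passes = 0;

    assert(nslots > 0 && nslots <= 16);

    while (task->done_dbs < task->ndbs)
    {
        for (int i = 0; i < nslots; i++)
            process_slot(task, &slots[i]);
        passes++;
    }
    return passes;
}
```

With 4 databases, 1 query each, and 2 slots, this simulation finishes in 5 passes; without the recursive call each freed slot would sit idle until the following pass before picking up its next database.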

--
nathan

Attachment Content-Type Size
v7-0001-introduce-framework-for-parallelizing-pg_upgrade-.patch text/plain 16.5 KB
v7-0002-use-new-pg_upgrade-async-API-for-subscription-sta.patch text/plain 9.0 KB
v7-0003-use-new-pg_upgrade-async-API-for-retrieving-relin.patch text/plain 12.9 KB
v7-0004-use-new-pg_upgrade-async-API-to-parallelize-getti.patch text/plain 3.3 KB
v7-0005-use-new-pg_upgrade-async-API-to-parallelize-repor.patch text/plain 3.4 KB
v7-0006-parallelize-data-type-checks-in-pg_upgrade.patch text/plain 12.4 KB
v7-0007-parallelize-isn-and-int8-passing-mismatch-check-i.patch text/plain 3.7 KB
v7-0008-parallelize-user-defined-postfix-ops-check-in-pg_.patch text/plain 5.3 KB
v7-0009-parallelize-incompatible-polymorphics-check-in-pg.patch text/plain 8.7 KB
v7-0010-parallelize-tables-with-oids-check-in-pg_upgrade.patch text/plain 3.5 KB
v7-0011-parallelize-user-defined-encoding-conversions-che.patch text/plain 4.4 KB
