Quick Links

Re: optimizing pg_upgrade's once-in-each-database steps

From:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
To:	Ilya Gladyshev <ilya(dot)v(dot)gladyshev(at)gmail(dot)com>
Cc:	Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: optimizing pg_upgrade's once-in-each-database steps
Date:	2024-09-01 21:05:21
Message-ID:	ZtTXEexwmlvsgsPs@nathan
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, Aug 31, 2024 at 01:18:10AM +0100, Ilya Gladyshev wrote:
> LGTM in general, but here are some final nitpicks:

Thanks for reviewing.

> + if (maxFd != 0)
> + (void) select(maxFd + 1, &input_mask, &output_mask, &except_mask, NULL);
>
> It´s a good idea to check for the return value of select, in case it
> returns any errors.

Done.

> + dbs_complete++;
> + (void) PQgetResult(slot->conn);
> + PQfinish(slot->conn);
>
> Perhaps it´s rather for me to understand, what does PQgetResult call do
> here?

I believe I was trying to follow the guidance that you should always call
PQgetResult() until it returns NULL, but looking again, I don't see any
need to call it since we free the connection immediately afterwards.

> + /* Check for connection failure. */
> + if (PQconnectPoll(slot->conn) == PGRES_POLLING_FAILED)
> + pg_fatal("connection failure: %s", PQerrorMessage(slot->conn));
> +
> + /* Check whether the connection is still establishing. */
> + if (PQconnectPoll(slot->conn) != PGRES_POLLING_OK)
> + return;
>
> Are the two consecutive calls of PQconnectPoll intended here? Seems a bit
> wasteful, but maybe that´s ok.

I think we can actually just use PQstatus() here. But furthermore, I think
the way I was initiating connections was completely bogus. IIUC before
calling PQconnectPoll() the first time, we should wait for a write
indicator from select(), and then we should only call PQconnectPoll() after
subsequent indicators from select(). After spending quite a bit of time
staring at the PQconnectPoll() code, I'm quite surprised I haven't seen any
problems thus far. If I had to guess, I'd say that either PQconnectPoll()
is more resilient than I think it is, or I've just gotten lucky because
pg_upgrade establishes connections quickly.

Anyway, to fix this, I've added some more fields to the slot struct to
keep track of the information we need to properly establish connections,
and we now pay careful attention to the return value of select() so that we
know which slots are ready for processing. This seemed like a nice little
optimization independent of fixing connection establishment. I was worried
this was going to require a lot more code, but I think it only added ~50
lines or so.

> We should probably mention this change in the docs as well, I found these
> two places that I think can be improved:

I've adjusted the documentation in v11.

--
nathan

Attachment	Content-Type	Size
v11-0001-Introduce-framework-for-parallelizing-various-pg.patch	text/plain	18.6 KB
v11-0002-Use-pg_upgrade-s-new-parallel-framework-for-subs.patch	text/plain	9.2 KB
v11-0003-Use-pg_upgrade-s-new-parallel-framework-to-get-r.patch	text/plain	15.7 KB
v11-0004-Use-pg_upgrade-s-new-parallel-framework-to-get-l.patch	text/plain	3.8 KB
v11-0005-Use-pg_upgrade-s-new-parallel-framework-for-exte.patch	text/plain	4.2 KB
v11-0006-Use-pg_upgrade-s-new-parallel-framework-for-data.patch	text/plain	13.5 KB
v11-0007-Use-pg_upgrade-s-new-parallel-framework-for-isn-.patch	text/plain	4.7 KB
v11-0008-Use-pg_upgrade-s-new-parallel-framework-for-post.patch	text/plain	6.3 KB
v11-0009-Use-pg_upgrade-s-new-parallel-framework-for-poly.patch	text/plain	7.5 KB
v11-0010-Use-pg_upgrade-s-new-parallel-framework-for-WITH.patch	text/plain	4.4 KB
v11-0011-Use-pg_upgrade-s-parallel-framework-for-encoding.patch	text/plain	5.5 KB

In response to

Re: optimizing pg_upgrade's once-in-each-database steps at 2024-08-31 00:18:10 from Ilya Gladyshev

Responses

Re: optimizing pg_upgrade's once-in-each-database steps at 2024-09-03 18:39:25 from Nathan Bossart
Re: optimizing pg_upgrade's once-in-each-database steps at 2024-09-03 23:28:23 from Ilya Gladyshev

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michail Nikolaev	2024-09-01 21:19:00	Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Previous Message	Alexander Korotkov	2024-09-01 19:35:27	Re: pgsql: Implement pg_wal_replay_wait() stored procedure