Re: optimizing pg_upgrade's once-in-each-database steps

From: Ilya Gladyshev <ilya(dot)v(dot)gladyshev(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: optimizing pg_upgrade's once-in-each-database steps
Date: 2024-08-04 18:19:57
Message-ID: 60cbc507-4c36-4037-9af7-028b1c8455b5@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 01.08.2024 22:41, Nathan Bossart wrote:
> Here is a new patch set. Besides rebasing, I've added the recursive call
> to process_slot() mentioned in the quoted text, and I've added quite a bit
> of commentary to async.c.
That's much better now, thanks! Here's my code review, note that I
haven't tested the patches yet:

+void
+async_task_add_step(AsyncTask *task,
+                    AsyncTaskGetQueryCB query_cb,
+                    AsyncTaskProcessCB process_cb, bool free_result,
+                    void *arg)

Is there any reason to have query as a callback function instead of char
*? From what I see right now, it doesn't give any extra flexibility, as
the query has to be static anyway (can't be customized on a per-database
basis) and will be created once before all the callbacks are run. While
passing in char * makes the API simpler, excludes any potential error of
making the query dependent on the current database and removes the
unnecessary malloc/free of the static strings.

+static void
+dispatch_query(const ClusterInfo *cluster, AsyncSlot *slot,
+               const AsyncTask *task)
+{
+    ...
+    if (!PQsendQuery(slot->conn, cbs->query))
+        conn_failure(slot->conn);
+}

This will print "connection failure: connection pointer is NULL", which
I don't think makes a lot of sense to the end user. I'd prefer something
like pg_fatal("failed to allocate a new connection").

     if (found)
-        pg_fatal("Data type checks failed: %s", report.data);
+    {
+        pg_fatal("Data type checks failed: %s",
data_type_check_report.data);
+        termPQExpBuffer(&data_type_check_report);
+    }

`found` should be removed and replaced with `data_type_check_failed`, as
it's not set anymore. Also the termPQExpBuffer after pg_fatal looks
unnecessary.

+static bool *data_type_check_results;
+static bool data_type_check_failed;
+static PQExpBufferData data_type_check_report;

IMO, it would be nicer to have these as a local state, that's passed in
as an arg* to the AsyncTaskProcessCB, which aligns with how the other
checks do it.

-- End of review --

Regarding keeping the connections, the way I envisioned it entailed
passing a list of connections from one check to the next one (or keeping
a global state with connections?). I didn't concretely look at the code
to verify this, so it's just an abstract idea.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey M. Borodin 2024-08-04 19:32:20 Re: Thoughts about NUM_BUFFER_PARTITIONS
Previous Message Jeff Davis 2024-08-04 15:36:28 Re: Comments on Custom RMGRs