Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-08-24 09:50:39
Message-ID: CAHut+Ps5=9q1CCyrrytyv-8oUBqE6rv-=YFSRuuQwVf+smC-Kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Notwithstanding the test errors I am getting for v24-0003, here are
some code review comments for this patch anyway.

======
src/bin/pg_upgrade/check.c

1. check_for_lost_slots

+
+/*
+ * Verify that all logical replication slots are usable.
+ */
+void
+check_for_lost_slots(ClusterInfo *cluster)

1a.
AFAIK we don't ever need to call this also for 'new_cluster'. So the
function should have no parameter and just access 'old_cluster'
directly.

~

1b.
Can't this be a static function now?

~

2.
+ for (i = 0; i < ntups; i++)
+ pg_log(PG_WARNING,
+ "\nWARNING: logical replication slot \"%s\" is in 'lost' state.",
+ PQgetvalue(res, i, i_slotname));

Is it correct that this message also includes the word "WARNING"?
Other PG_WARNING messages don't do that.

~~~

3. check_for_confirmed_flush_lsn

+/*
+ * Verify that all logical replication slots consumed all WALs, except a
+ * CHECKPOINT_SHUTDOWN record.
+ */
+static void
+check_for_confirmed_flush_lsn(ClusterInfo *cluster)

AFAIK we don't ever need to call this also for 'new_cluster'. So the
function should have no parameter and just access 'old_cluster'
directly.

~

4.
+ for (i = 0; i < ntups; i++)
+ pg_log(PG_WARNING,
+ "\nWARNING: logical replication slot \"%s\" has not consumed WALs yet",
+ PQgetvalue(res, i, i_slotname));

Is it correct that this message also includes the word "WARNING"?
Other PG_WARNING messages don't do that.

======
src/bin/pg_upgrade/controldata.c

5. get_control_data

+ else if ((p = strstr(bufin, "Latest checkpoint location:")) != NULL)
+ {
+ /*
+ * Gather the latest checkpoint location if the cluster is PG17
+ * or later. This is used for upgrading logical replication
+ * slots.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)

But we are not "gathering" anything. It's just one LSN. I think this
ought to just say "Read the latest..."

~

6.
+ /*
+ * The upper and lower part of LSN must be read separately
+ * because it is reported in %X/%X format.
+ */

/reported/stored as/

======
src/bin/pg_upgrade/pg_upgrade.h

7.
+void check_for_lost_slots(ClusterInfo *cluster);\

Why is this needed here? Can't this be a static function?

======
.../t/003_logical_replication_slots.pl

8.
+# 2. Consume WAL records to avoid another type of upgrade failure. It will be
+# tested in subsequent cases.
+$old_publisher->safe_psql('postgres',
+ "SELECT count(*) FROM pg_logical_slot_get_changes('test_slot1', NULL, NULL);"
+);

I wondered if that step really needed. Why will there be WAL records to consume?

IIUC we haven't published anything yet.

~~~

9.
+# ------------------------------
+# TEST: Successful upgrade
+
+# Preparations for the subsequent test:
+# 1. Remove the remained slot
+$old_publisher->start;
+$old_publisher->safe_psql('postgres',
+ "SELECT * FROM pg_drop_replication_slot('test_slot1');"
+);

Should removal of the slot be done as part of the cleanup of the
previous test, instead of preparing for this one?

~~~

10.
# 3. Disable the subscription once
$subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
$old_publisher->stop;

10a.
What do you mean by "once"?

~

10b.
That old_publisher->stop; seems strangely placed. Why is it here?

~~~

11.
# Check that the slot 'test_slot1' has migrated to the new cluster
$new_publisher->start;
my $result = $new_publisher->safe_psql('postgres',
"SELECT slot_name, two_phase FROM pg_replication_slots");
is($result, qq(sub|t), 'check the slot exists on new cluster');

~

That comment now seems wrong. That slot was previously removed, right?

~~~

12.
# Update the connection
my $new_connstr = $new_publisher->connstr . ' dbname=postgres';
$subscriber->safe_psql('postgres',
"ALTER SUBSCRIPTION sub CONNECTION '$new_connstr'");
$subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE");

~

Maybe better to combine both SQL.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-08-24 10:05:17 Re: Adding argument names to aggregate functions
Previous Message Richard Guo 2023-08-24 09:27:21 Re: Support run-time partition pruning for hash join