From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Subject: | Re: [PoC] pg_upgrade: allow to upgrade publisher node |
Date: | 2023-08-30 05:27:55 |
Message-ID: | CAHut+Pv=O-kn24Jnr4ingv3nesL_=G9jhCcQhv5HG7ktJvi5aQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Kuroda-san.
Here are some review comments for v28-0003.
======
src/bin/pg_upgrade/check.c
1. check_and_dump_old_cluster
+ /*
+ * Logical replication slots can be migrated since PG17. See comments atop
+ * get_old_cluster_logical_slot_infos().
+ */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+ check_old_cluster_for_valid_slots(live_check);
+
IIUC we are preferring to use the <= 1600 style of version check
instead of >= 1700 where possible.
So this comment and version check ought to be removed from here, and
done inside check_old_cluster_for_valid_slots() instead.
~~~
2. check_old_cluster_for_valid_slots
+/*
+ * check_old_cluster_for_valid_slots()
+ *
+ * Make sure logical replication slots can be migrated to new cluster.
+ * Following points are checked:
+ *
+ * - All logical replication slots are usable.
+ * - All logical replication slots consumed all WALs, except a
+ * CHECKPOINT_SHUTDOWN record.
+ */
+static void
+check_old_cluster_for_valid_slots(bool live_check)
I suggested in the previous comment above (#1) that the version check
should be moved into this function.
Therefore, this function comment now should also mention slot upgrade
is only allowed for >= PG17
~~~
3.
+static void
+check_old_cluster_for_valid_slots(bool live_check)
+{
+ int i,
+ ntups,
+ i_slotname;
+ PGresult *res;
+ DbInfo *active_db = &old_cluster.dbarr.dbs[0];
+ PGconn *conn;
+
+ /* Quick exit if the cluster does not have logical slots. */
+ if (count_old_cluster_logical_slots() == 0)
+ return;
3a.
See comment #1. At the top of this function body there should be a
version check like:
if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
return;
~
3b.
/Quick exit/Quick return/
~
4.
+ prep_status("Checking for logical replication slots");
I felt that should add the word "valid" like:
"Checking for valid logical replication slots"
~~~
5.
+ /* Check there are no logical replication slots with a 'lost' state. */
+ res = executeQueryOrDie(conn,
+ "SELECT slot_name FROM pg_catalog.pg_replication_slots "
+ "WHERE wal_status = 'lost' AND "
+ "temporary IS FALSE;");
Since the SQL is checking if there *are* lost slots I felt it would be
more natural to reverse that comment.
SUGGESTION
/* Check and reject if there are any logical replication slots with a
'lost' state. */
~~~
6.
+ /*
+ * Do additional checks if a live check is not required. This requires
+ * that confirmed_flush_lsn of all the slots is the same as the latest
+ * checkpoint location, but it would be satisfied only when the server has
+ * been shut down.
+ */
+ if (!live_check)
I think the comment can be rearranged slightly:
SUGGESTION
Do additional checks to ensure that 'confirmed_flush_lsn' of all the
slots is the same as the latest checkpoint location.
Note: This can be satisfied only when the old_cluster has been shut
down, so we skip this for "live" checks.
======
src/bin/pg_upgrade/controldata.c
7.
+ /*
+ * Read the latest checkpoint location if the cluster is PG17
+ * or later. This is used for upgrading logical replication
+ * slots.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
+ {
Fetching this "Latest checkpoint location:" value is only needed for
the check_old_cluster_for_valid_slots validation check, isn't it? But
AFAICT this code is common for both old_cluster and new_cluster.
I am not sure what is best to do:
- Do only the minimal logic needed?
- Read the value redundantly even for new_cluster just to keep code simpler?
Either way, maybe the comment should say something about this.
======
.../t/003_logical_replication_slots.pl
8. Consider adding one more test
Maybe there should also be some "live check" test performed (e.g.
using --check, and a running old_cluster).
This would demonstrate pg_upgrade working successfully even when the
WAL records are not consumed (because LSN checks would be skipped in
check_old_cluster_for_valid_slots function).
------
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2023-08-30 06:42:00 | [17] CREATE SUBSCRIPTION ... SERVER |
Previous Message | Andy Fan | 2023-08-30 04:47:35 | Re: Extract numeric filed in JSONB more effectively |