From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Subject: | Re: [PoC] pg_upgrade: allow to upgrade publisher node |
Date: | 2023-08-22 00:31:25 |
Message-ID: | CAHut+Ps+Varx8uUF5c7YMXCSMO6BZ8SUXc9pv5iXkM2rqh7DEA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Kuroda-san,
Here are some review comments for v22-0003.
(FYI, I was already mid-way through this review before you posted new v23*
patches, so I am posting it anyway in case some comments still apply.)
======
src/bin/pg_upgrade/check.c
1. check_for_lost_slots
+ /* logical slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(cluster->major_version) <= 1600)
+ return;
1a
Maybe the comment should start uppercase for consistency with others.
~
1b.
IMO if you check < 1700 instead of <= 1600 it will be a better match with
the comment.
~~~
2. check_for_lost_slots
+ for (i = 0; i < ntups; i++)
+ {
+ pg_log(PG_WARNING,
+ "\nWARNING: logical replication slot \"%s\" is in 'lost' state.",
+ PQgetvalue(res, i, i_slotname));
+ }
+
+
The braces {} are not needed anymore
~~~
3. check_for_confirmed_flush_lsn
+ /* logical slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(cluster->major_version) <= 1600)
+ return;
3a.
Maybe the comment should start uppercase for consistency with others.
~
3b.
IMO if you check < 1700 instead of <= 1600 it will be a better match with
the comment.
~~~
4. check_for_confirmed_flush_lsn
+ for (i = 0; i < ntups; i++)
+ {
+ pg_log(PG_WARNING,
+ "\nWARNING: logical replication slot \"%s\" has not consumed WALs yet",
+ PQgetvalue(res, i, i_slotname));
+ }
+
The braces {} are not needed anymore
======
src/bin/pg_upgrade/controldata.c
5. get_control_data
+ /*
+ * Gather latest checkpoint location if the cluster is newer or
+ * equal to 17. This is used for upgrading logical replication
+ * slots.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 17)
5a.
/newer or equal to 17/PG17 or later/
~~~
5b.
>= 17 should be >= 1700
~~~
6. get_control_data
+ {
+ char *slash = NULL;
+ uint64 upper_lsn, lower_lsn;
+
+ p = strchr(p, ':');
+
+ if (p == NULL || strlen(p) <= 1)
+ pg_fatal("%d: controldata retrieval problem", __LINE__);
+
+ p++; /* remove ':' char */
+
+ p = strpbrk(p, "01234567890ABCDEF");
+
+ /*
+ * Upper and lower part of LSN must be read separately
+ * because it is reported as %X/%X format.
+ */
+ upper_lsn = strtoul(p, &slash, 16);
+ lower_lsn = strtoul(++slash, NULL, 16);
+
+ /* And combine them */
+ cluster->controldata.chkpnt_latest =
+ (upper_lsn << 32) | lower_lsn;
+ }
Should 'upper_lsn' and 'lower_lsn' be declared as uint32? That seems a
better mirror for LSN_FORMAT_ARGS.
======
src/bin/pg_upgrade/info.c
7. get_logical_slot_infos
+
+ /*
+ * Do additional checks if slots are found on the old node. If something is
+ * found on the new node, a subsequent function
+ * check_new_cluster_is_empty() would report the name of slots and raise a
+ * fatal error.
+ */
+ if (cluster == &old_cluster && slot_count)
+ {
+ check_for_lost_slots(cluster);
+
+ if (!live_check)
+ check_for_confirmed_flush_lsn(cluster);
+ }
It somehow doesn't feel right for these extra checks to be jammed into this
function, just because you conveniently have the slot_count available.
On the NEW cluster side, there was extra checking in the
check_new_cluster() function.
For consistency, I think this OLD cluster checking should be done in the
check_and_dump_old_cluster() function -- see the "Check for various failure
cases" comment -- IMO this new fragment belongs there with the other checks.
======
src/bin/pg_upgrade/pg_upgrade.h
8.
bool date_is_int;
bool float8_pass_by_value;
uint32 data_checksum_version;
+
+ XLogRecPtr chkpnt_latest;
} ControlData;
I don't think the new field is particularly different from all the others
that it needs a blank line separator.
======
.../t/003_logical_replication_slots.pl
9.
# Initialize old node
my $old_publisher = PostgreSQL::Test::Cluster->new('old_publisher');
$old_publisher->init(allows_streaming => 'logical');
-$old_publisher->start;
# Initialize new node
my $new_publisher = PostgreSQL::Test::Cluster->new('new_publisher');
$new_publisher->init(allows_streaming => 'replica');
-my $bindir = $new_publisher->config_data('--bindir');
+# Initialize subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$subscriber->init(allows_streaming => 'logical');
-$old_publisher->stop;
+my $bindir = $new_publisher->config_data('--bindir');
~
Are those removal of the old_publisher start/stop changes that actually
should be done in the 0002 patch?
~~~
10.
$old_publisher->safe_psql(
'postgres', qq[
SELECT pg_create_logical_replication_slot('test_slot2', 'test_decoding',
false, true);
+ SELECT count(*) FROM pg_logical_slot_get_changes('test_slot1', NULL,
NULL);
]);
~
What is the purpose of the added SELECT? It doesn't seem covered by the
comment.
~~~
11.
# Remove an unnecessary slot and generate WALs. These records would not be
# consumed before doing pg_upgrade, so that the upcoming test would fail.
$old_publisher->start;
$old_publisher->safe_psql(
'postgres', qq[
SELECT pg_drop_replication_slot('test_slot2');
CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
]);
$old_publisher->stop;
Minor rewording of comment sentence.
SUGGESTION
Because these WAL records do not get consumed it will cause the upcoming
pg_upgrade test to fail.
~~~
12.
# Cause a failure at the start of pg_upgrade because the slot still have
# unconsumed WAL records
~
/still have/still has/
------
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2023-08-22 01:44:07 | Re: should frontend tools use syncfs() ? |
Previous Message | Michael Paquier | 2023-08-21 23:58:42 | Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue |