From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Давыдов Виталий <v(dot)davydov(at)postgrespro(dot)ru> |
Subject: | Re: Slow catchup of 2PC (twophase) transactions on replica in LR |
Date: | 2024-05-14 04:18:45 |
Message-ID: | CAHut+PssnvZ86cZw5WGEoOU3mAzzMEBwYcyFG9Th0XM=n49H5A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Kuroda-san, Here are some review comments for all patches v9*
//////////
Patch v9-0001
//////////
There were no changes since v8-0001, so no comments.
//////////
Patch v9-0002
//////////
======
Commit Message
2.1.
Regarding the off->on case, the logical replication already has a
mechanism for it, so there is no need to do anything special for the
on->off case; however, we must connect to the publisher and expressly
change the parameter. The operation cannot be rolled back, and
altering the parameter from "on" to "off" within a transaction is
prohibited.
In the opposite case, there is no need to prevent this because the
logical replication worker already had the mechanism to alter the slot
option at a convenient time.
~
This explanation seems to be going around in circles, without giving
any new information:
AFAICT, "Regarding the off->on case, the logical replication already
has a mechanism for it, so there is no need to do anything special for
the on->off case;"
is saying pretty much the same as:
"In the opposite case, there is no need to prevent this because the
logical replication worker already had the mechanism to alter the slot
option at a convenient time."
But, what I hoped for in previous review comments was an explanation
somewhat less vague than "already has a mechanism" or "already had the
mechanism". Can't this have just 1 or 2 lines to say WHAT is that
existing mechanism for the "off" to "on" case, and WHY that means
there is nothing special to do in that scenario?
======
src/backend/commands/subscriptioncmds.c
2.2. AlterSubscription
/*
- * The changed two_phase option of the slot can't be rolled
- * back.
+ * Since altering the two_phase option of subscriptions
+ * also leads to changing the slot option, this command
+ * cannot be rolled back. So prevent this if we are in a
+ * transaction block. In the opposite case, there is no
+ * need to prevent this because the logical replication
+ * worker already had the mechanism to alter the slot
+ * option at a convenient time.
*/
(Same previous review comments, and same as my review comment for the
commit message above).
I don't think "already had the mechanism" is enough explanation.
Also, the 2nd sentence doesn't make sense here because the comment
only said "altering the slot option" -- it didn't say it was altering
it to "on" or altering it to "off", so "the opposite case" has no
meaning.
~~~
2.3. AlterSubscription
/*
- * Try to acquire the connection necessary for altering slot.
+ * Check the need to alter the replication slot. Failover and two_phase
+ * options are controlled by both the publisher (as a slot option) and the
+ * subscriber (as a subscription option). The slot option must be altered
+ * only when changing "on" to "off". Because in opposite case, the logical
+ * replication worker already has the mechanism to do so at a convenient
+ * time.
+ */
+ update_failover = replaces[Anum_pg_subscription_subfailover - 1];
+ update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate - 1] &&
+ !opts.twophase);
This is again the same as other review comments above. Probably, when
some better explanation can be found for "already has the mechanism to
do so at a convenient time." then all of these places can be changed
using similar text.
//////////
Patch v9-0003
//////////
There are some imperfect code comments but AFAIK they are the same
ones from patch 0002. I think patch 0003 is just moving those comments
to different places, so probably they would already be addressed by
patch 0002.
//////////
Patch v9-0004
//////////
======
doc/src/sgml/catalogs.sgml
4.1.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subforcealter</structfield> <type>bool</type>
+ </para>
+ <para>
+ If true, the subscription can be altered <literal>two_phase</literal>
+ option, even if there are prepared transactions
+ </para></entry>
+ </row>
+
BEFORE
If true, the subscription can be altered <literal>two_phase</literal>
option, even if there are prepared transactions
SUGGESTION
If true, then the ALTER SUBSCRIPTION command can disable
<literal>two_phase</literal> option, even if there are uncommitted
prepared transactions from when <literal>two_phase</literal> was
enabled
======
doc/src/sgml/ref/alter_subscription.sgml
4.2.
-
- <para>
- The <literal>two_phase</literal> parameter can only be altered when the
- subscription is disabled. When altering the parameter from
<literal>on</literal>
- to <literal>off</literal>, the backend process checks for any incomplete
- prepared transactions done by the logical replication worker (from when
- <literal>two_phase</literal> parameter was still <literal>on</literal>)
- and, if any are found, those are aborted.
- </para>
Well, I still think there ought to be some mention of the relationship
between 'force_alter' and 'two_phase' given on this ALTER SUBSCRIPTION
page. Then the user can cross-reference to read what the 'force_alter'
actually does.
======
doc/src/sgml/ref/create_subscription.sgml
4.3.
+
+ <varlistentry id="sql-createsubscription-params-with-force-alter">
+ <term><literal>force_alter</literal> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ Specifies whether the subscription can be altered
+ <literal>two_phase</literal> option, even if there are prepared
+ transactions. If specified, the backend process checks for any
+ incomplete prepared transactions done by the logical replication
+ worker (from when <literal>two_phase</literal> parameter was still
+ <literal>on</literal>), if any are found, those are aborted.
+ Otherwise, Altering the parameter from <literal>on</literal> to
+ <literal>off</literal> will give an error when there are prepared
+ transactions done by the logical replication worker.
+ The default is <literal>false</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
This explanation seems a bit repetitive. I think it can be improved as follows:
SUGGESTION
Specifies if the ALTER SUBSCRIPTION can be forced to proceed instead
of giving an error.
There is currently only one scenario where this parameter has any
effect: When altering two_phase option from true to false it is
possible for there to be incomplete prepared transactions done by the
logical replication worker (from when two_phase parameter was still
true). If force_alter is false, then this will give an error; if
force_alter is true, then the incomplete prepared transactions are
aborted and the alter will proceed.
The default is false.
======
src/backend/commands/subscriptioncmds.c
4.4. CreateSubscription
values[Anum_pg_subscription_subfailover - 1] = BoolGetDatum(opts.failover);
+ values[Anum_pg_subscription_subforcealter] = BoolGetDatum(opts.force_alter);
values[Anum_pg_subscription_subconninfo - 1] =
Hmm, looks like a bug. Shouldn't that index say -1?
~~~
4.5. AlterSubscription
+ /*
+ * Abort prepared transactions only if
+ * 'force_alter' option is true. Otherwise raise
+ * an ERROR.
+ */
+ if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER))
+ {
+ if (!opts.force_alter)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = off"),
+ errhint("Resolve these transactions or set %s, and then try again.",
+ "force_alter = true")));
+ }
+ else
+ {
+ if (!sub->forcealter)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = off"),
+ errhint("Resolve these transactions or set %s, and then try again.",
+ "force_alter = true")));
+ }
+
IIUC this code can be simplified to remove the error duplication.
Something like below:
SUGGESTION
bool raise_error = IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) ?
!opts.force_alter : !sub->forcealter;
if (raise_error)
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("cannot alter %s when there are prepared transactions",
"two_phase = off"),
errhint("Resolve these transactions or set %s, and then try again.",
"force_alter = true")));
======
src/bin/pg_dump/pg_dump.c
4.6. getSubscriptions
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query,
+ " s.subforcealter\n");
+ else
+ appendPQExpBuffer(query,
+ " false AS subforcealter\n");
+
+
4.6a.
Should this just be combined with the existing "if
(fout->remoteVersion >= 170000)" for failover?
~
4.6b.
Double blank lines.
======
src/bin/psql/describe.c
4.7.
+ if (pset.sversion >= 170000)
+ appendPQExpBuffer(&buf,
+ ", subforcealter AS \"%s\"\n",
+ gettext_noop("Force_alter"));
IMO the column title should be "Force alter" (i.e. without the underscore)
======
src/include/catalog/pg_subscription.h
4.8. CATALOG
+ bool subforcealter; /* True if we allow to drop prepared transactions
+ * when altering two_phase "on"->"off". */
I think this is not actually the description of 'force_alter'. What
you wrote just happens to be the only case that this option currently
works for. Maybe a more correct description is something like below.
SUGGESTION
True allows the ALTER SUBSCRIPTION command to proceed under conditions
that would otherwise result in an error. Currently, 'force_alter' only
has an effect when altering the two_phase option from "true" to
"false".
~~~
4.9. struct Subscription
+ bool forcealter; /* True if we allow to drop prepared
+ * transactions when altering two_phase
+ * "on"->"off". */
Ditto the previous review comment.
======
src/test/regress/expected/subscription.out
4.10.
-
List of subscriptions
- Name | Owner | Enabled | Publication
| Binary | Streaming | Two-phase commit | Disable on error | Origin |
Password required | Run as owner? | Failover | Synchronous commit |
Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub}
| f | off | d | f | none |
t | f | f | off |
dbname=regress_doesnotexist | 0/0
+
List of
subscriptions
+ Name | Owner | Enabled | Publication
| Binary | Streaming | Two-phase commit | Disable on error | Origin |
Password required | Run as owner? | Failover | Force_alter |
Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+-------------+--------------------+-----------------------------+----------
The column heading should be "Force alter", as already mentioned by an
earlier review comment (#4.7)
======
src/test/subscription/t/099_twophase_added.pl
4.11.
+# Alter the two_phase with the force_alter option. Apart from the the last
+# ALTER SUBSCRIPTION command, the command will abort the prepared transaction
+# and succeed.
There is typo "the the" and the wording is a bit strange. Why not just say:
SUGGESTION
Alter the two_phase true to false with the force_alter option enabled.
This command will succeed after aborting the prepared transaction.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-05-14 05:12:38 | Re: [PATCH] Fix bug when calling strncmp in check_authmethod_valid |
Previous Message | David Rowley | 2024-05-14 04:10:52 | Re: JIT compilation per plan node |