Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.
Date: 2024-12-10 02:34:47
Message-ID: CAHut+PvjZBD9dqbp16k23zFC5Y6pTiDVnKrDTgafs4B1i2xJCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Shubham,

Here are some review comments for the patch v1-0001.

Note -- I think Kuroda-san's idea to use a new switch like
'--enable-two-phase' would eliminate lots of my review comments below
about the inconsistencies with CREATE SUBSCRIBER, and the current
confusion about defaults and argument values etc.

======
Commit message

1.
By default, two-phase commit is enabled if the option is provided without
arguments. Users can explicitly set it to 'on' or 'off' using '--two-phase=on'
or '--two-phase=off'.

~

But you don't say what is the default if the option is omitted entirely.

~~~

2.
Notably, the replication for prepared transactions functions regardless of the
initial two-phase setting on the replication slot. However, the user cannot
change the setting after the subscription is created unless a future command,
such as 'ALTER SUBSCRIPTION ... SET (two-phase = on)', is supported.
This provides flexibility for future enhancements.

~

Typo in ALTER example with the option name /two-phase/two_phase/

~~~

3.
Documentation has been updated to reflect the new option, and test cases have
been added to validate various scenarios, including proper validation of the
'--two-phase' flag and its combinations with other options.

/flag/option/ ??

======
doc/src/sgml/ref/pg_createsubscriber.sgml

4.
+ <varlistentry>
+ <term><option>-T</option></term>
+ <term><option>--two_phase</option></term>
+ <listitem>
+ <para>
+ Enables or disables two-phase commit for the subscription.
+ When the option is provided without a value, it defaults to
+ <literal>on</literal>. Specify <literal>on</literal> to enable or
+ <literal>off</literal> to disable.
+ Two-phase commit ensures atomicity in logical replication for prepared
+ transactions. By default, this option is enabled unless explicitly set
+ to <literal>off</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+

4a.
Typo -- the option you made is 'two-phase', not 'two_phase'

~

4b.
When you say "By default, this option is enabled unless explicitly set
to off." I take that as meaning it is default *enabled* even when the
option is entirely omitted.

But, that would be contrary to the behaviour of a normal CREATE
SUBSCRIPTION 'two_phase' option, so I wonder why should
pg_createsubscriber have a different default. Is this intentional? IMO
if no switch is specified then I thought it should be the same as the
CREATE SUBSCRIPTION default (i.e. false).

~

4c.
This "-T" option should be moved to be adjacent (below) "-t" to keep
everything consistently in A-Z order.

======
src/bin/pg_basebackup/pg_createsubscriber.c

5.
bool made_replslot; /* replication slot was created */
bool made_publication; /* publication was created */
+ bool two_phase; /* two-phase option was created */

I am not sure what "option was created" even means. Cut/paste error?

Also, perhaps this field belongs with the 1st group of fields in this
struct, instead of with those made_xxx fields.

~~~

store_pub_sub_info:

6.
+ /* Store two-phase option */
+ dbinfo[i].two_phase = opt->two_phase;
+

The comment says the same as what the code is doing which seems
unhelpful/redundant. And if you rearrange the struct fields as
previously suggested, this assignment should also move.

~~~

7.
pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i,
dbinfo[i].subname ? dbinfo[i].subname : "(auto)",
dbinfo[i].subconninfo);
+ pg_log_debug("publisher(%d): two-phase: %s", i,
+ dbinfo[i].two_phase ? "true" : "false");

"two_phase" is a subscription option. So shouldn't this added
pg_log_debug info be part of the preceding pg_log_debug about the
subscription?

~~~

main:

8.
{"recovery-timeout", required_argument, NULL, 't'},
+ {"two_phase", optional_argument, NULL, 'T'},
{"subscriber-username", required_argument, NULL, 'U'},

AFAIK this option was supposed to be 'two-phase' (dash instead of
underscore is consistent with the other pg_createsubscriber options)

~~~

9.
opt.sub_username = NULL;
+ opt.two_phase = true;

As previously mentioned this default is not the same as the CREATE
SUBSCRIPTION default for 'two_phase', and I find that inconsistency to
be confusing.

~~~

10.
+ case 'T':
+ if (optarg != NULL)
+ {
+ if (strcmp(optarg, "on") == 0)
+ opt.two_phase = true;
+ else if (strcmp(optarg, "off") == 0)
+ opt.two_phase = false;
+ else
+ pg_fatal("invalid value for --two-phase: must be 'on' or 'off'");
+ }
+ else
+ opt.two_phase = true; /* Default to true if no argument
+ * is provided */
+ break;

I wonder if users familiar with the CREATE SUBSCRIPTION 'two_phase'
might find it strange that the only values accepted here are 'on' and
'off' but now all the other boolean variants.

======
.../t/040_pg_createsubscriber.pl

11.
+# Run pg_createsubscriber on a promoted server with two_phase=on
+command_ok(
+ [
+ 'pg_createsubscriber', '--verbose',
+ '--recovery-timeout', "$PostgreSQL::Test::Utils::timeout_default",
+ '--pgdata', $node_b->data_dir,
+ '--publisher-server', $node_a->connstr($db10),
+ '--subscriber-port', $node_b->port,
+ '--database', $db10,
+ '--two_phase=on'
+ ],
+ 'created subscription with two-phase commit enabled');

Hmm. I expect this should have been specified as '--two-phase=on'
(dash instead of underscore for consistency with all other switches of
pg_createsubscriber) but previous typos (e.g. #8 above) have
compounded the confusion.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-12-10 02:52:40 Re: Memory leak in WAL sender with pgoutput (v10~)
Previous Message Tom Lane 2024-12-10 02:18:48 Re: Fix some comments for GUC hooks of timezone_abbreviations