Re: [HACKERS] logical decoding of two-phase transactions

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2021-07-27 06:09:51
Message-ID: CAHut+Pts_bWx_RrXu+YwbiJva33nTROoQQP5H4pVrF+NcCMkRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 23, 2021 at 8:08 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jul 20, 2021 at 9:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Please find attached the latest patch set v98*
> >
>
> Review comments:
> ================

All the following review comments are addressed in v99* patch set.

> 1.
> /*
> - * Handle STREAM COMMIT message.
> + * Common spoolfile processing.
> + * Returns how many changes were applied.
> */
> -static void
> -apply_handle_stream_commit(StringInfo s)
> +static int
> +apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
>
> Let's extract this common functionality (common to current code and
> the patch) as a separate patch? I think we can commit this as a
> separate patch.
>

Done. Split patches as requested.

> 2.
> apply_spooled_messages()
> {
> ..
> elog(DEBUG1, "replayed %d (all) changes from file \"%s\"",
> nchanges, path);
> ..
> }
>
> You have this DEBUG1 message in apply_spooled_messages and its
> callers. You can remove it from callers as the patch already has
> another debug message to indicate whether it is stream prepare or
> stream commit. Also, if this is the only reason to return nchanges
> from apply_spooled_messages() then we can get rid of that as well.
>

Done.

> 3.
> + /*
> + * 2. Mark the transaction as prepared. - Similar code as for
> + * apply_handle_prepare (i.e. two-phase non-streamed prepare)
> + */
> +
> + /*
> + * BeginTransactionBlock is necessary to balance the EndTransactionBlock
> + * called within the PrepareTransactionBlock below.
> + */
> + BeginTransactionBlock();
> + CommitTransactionCommand(); /* Completes the preceding Begin command. */
> +
> + /*
> + * Update origin state so we can restart streaming from correct position
> + * in case of crash.
> + */
> + replorigin_session_origin_lsn = prepare_data.end_lsn;
> + replorigin_session_origin_timestamp = prepare_data.prepare_time;
> +
> + PrepareTransactionBlock(gid);
>
> I think you can move this part into a common function
> apply_handle_prepare_internal. If that is possible then you might want
> to move this part into a common functionality patch as mentioned in
> point-1.
>

Done. (The common function is included in patch 0001)

> 4.
> + xid = logicalrep_read_stream_prepare(s, &prepare_data);
> + elog(DEBUG1, "received prepare for streamed transaction %u", xid);
>
> It is better to have an empty line between the above code lines for
> the sake of clarity.
>

Done.

> 5.
> +/* Commit (and abort) information */
> typedef struct LogicalRepCommitData
>
> How is this structure related to abort? Even if it is, why this
> comment belongs to this patch?
>

OK. Removed this from the patch.

> 6. Most of the code in logicalrep_write_stream_prepare() and
> logicalrep_write_prepare() is same except for message. I think if we
> want we can handle both of them with a single message by setting some
> flag for stream case but probably there will be some additional
> checking required on the worker-side. What do you think? I think if we
> want to keep them separate then at least we should keep the common
> functionality in logicalrep_write_*/logicalrep_read_* in separate
> functions. This way we will avoid minor inconsistencies in-stream and
> non-stream functions.
>

Done. (The common functions are included in patch 0001).

> 7.
> +++ b/doc/src/sgml/protocol.sgml
> @@ -2881,7 +2881,7 @@ The commands accepted in replication mode are:
> Begin Prepare and Prepare messages belong to the same transaction.
> It also sends changes of large in-progress transactions between a pair of
> Stream Start and Stream Stop messages. The last stream of such a transaction
> - contains a Stream Commit or Stream Abort message.
> + contains a Stream Prepare, Stream Commit or Stream Abort message.
>
> I am not sure if it is correct to mention Stream Prepare here because
> after that we will send commit prepared as well for such a
> transaction. So, I think we should remove this change.
>

Done.

> 8.
> -ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
> -
> \dRs+
>
> +ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
>
> Is there a reason for this change in the tests?
>

Yes, the setting of slot_name = NONE really belongs with the DROP
SUBSCRIPTION. Similarly, the \dRs+ is done to test the effect of the
setting of the streaming option (not the slot_name = NONE). Since I
needed to add a new DROP SUBSCRIPTION (because now the streaming
option works) so I also refactored this exiting test to make all the
test formats consistent.

> 9.
> I think this contains a lot of streaming tests in 023_twophase_stream.
> Let's keep just one test for crash-restart scenario (+# Check that 2PC
> COMMIT PREPARED is decoded properly on crash restart.) where both
> publisher and subscriber get restarted. I think others are covered in
> one or another way by other existing tests. Apart from that, I also
> don't see the need for the below tests:
> # Do DELETE after PREPARE but before COMMIT PREPARED.
> This is mostly the same as the previous test where the patch is testing Insert
> # Try 2PC transaction works using an empty GID literal
> This is covered in 021_twophase.
>

Done. Removed all the excessive tests as you suggested.

> 10.
> +++ b/src/test/subscription/t/024_twophase_cascade_stream.pl
> @@ -0,0 +1,271 @@
> +
> +# Copyright (c) 2021, PostgreSQL Global Development Group
> +
> +# Test cascading logical replication of 2PC.
>
> In the above comment, you might want to say something about streaming.
> In general, I am not sure if it is really adding value to have these
> many streaming tests for cascaded setup and doing the whole setup
> again after we have done in tests 022_twophase_cascade. I think it is
> sufficient to do just one or two streaming tests by enhancing
> 022_twophase_cascade, you can alter subscription to enable streaming
> after doing non-streaming tests.
>

Done. Remove the 024 TAP tests, and instead merged the streaming
cascade tests into the 022_twophase_casecase.pl as you suggested.

> 11. Have you verified that all these tests went through the streaming
> code path? If not, you can once enable DEBUG message in
> apply_handle_stream_prepare() and see if all tests hit that.
>

Yeah, it was done a very long time ago when the tests were first
written; Anyway, just to be certain I temporarily modified the code as
suggested and confirmed by the logfiles that the tests is running
through apply_handle_stream_prepare.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-07-27 06:11:41 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Yura Sokolov 2021-07-27 06:06:24 Re: [PoC] Improve dead tuple storage for lazy vacuum