Re: Skip collecting decoded changes of already-aborted transactions

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skip collecting decoded changes of already-aborted transactions
Date: 2024-11-11 18:00:06
Message-ID: CAD21AoDWsWS-1c9BEt2mXw480+zaWqr+36AxhB_LuVZk5N1cFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 10, 2024 at 11:24 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Sawada-San, here are some review comments for the patch v5-0001.
>

Thank you for reviewing the patch!

> ======
> Commit message.
>
> 1.
> This commit introduces an additional check to determine if a
> transaction is already aborted by a CLOG lookup, so the logical
> decoding skips further change also when it doesn't touch system
> catalogs.
>
> ~
>
> Is that wording backwards? Is it meant to say:
>
> This commit introduces an additional CLOG lookup check to determine if
> a transaction is already aborted, so the ...

Fixed.

>
> ======
> contrib/test_decoding/sql/stats.sql
>
> 2
> +SELECT slot_name, spill_txns = 0 AS spill_txn, spill_count = 0 AS
> spill_count FROM pg_stat_replication_slots WHERE slot_name =
> 'regression_slot_stats4_twophase';
>
> Why do the SELECT "= 0" like this, instead of just having zeros in the
> "expected" results?

Indeed. I used "=0" like other queries in the same file do, but it
makes sense to me just to have zeros in the expected file. That way,
it would make it a bit easier to investigate in case of failures.

>
> ======
> .../replication/logical/reorderbuffer.c
>
> 3.
> static void ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
> - bool txn_prepared);
> + bool txn_prepared, bool mark_streamed);
>
> That last parameter name ('mark_streamed') does not match the same
> parameter name in this function's definition.

Fixed.

>
> ~~~
>
> ReorderBufferTruncateTXN:
>
> 4.
> if (txn_streaming && (!txn_prepared) &&
> (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0)))
> txn->txn_flags |= RBTXN_IS_STREAMED;
>
> if (txn_prepared)
> {
> ~
>
> Since the following condition was already "if (txn_prepared)" would it
> be better remove the "(!txn_prepared)" here and instead just refactor
> the code like:
>
> if (txn_prepared)
> {
> ...
> }
> else if (txn_streaming && (rbtxn_is_toptxn(txn) || (txn->nentries_mem != 0)))
> {
> ...
> }

Good idea.

>
> ~~~
>
> ReorderBufferProcessTXN:
>
> 5.
> +
> + /* Remember the transaction is aborted */
> + Assert((curtxn->txn_flags & RBTXN_IS_COMMITTED) == 0);
> + curtxn->txn_flags |= RBTXN_IS_ABORTED;
>
> Missing period on comment.

Fixed.

>
> ~~~
>
> ReorderBufferCheckTXNAbort:
>
> 6.
> + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't
> + * check the transaction status, so the caller always processes this
> + * transaction. This is to disable this check for regression tests.
> + */
> +static bool
> +ReorderBufferCheckTXNAbort(ReorderBuffer *rb, ReorderBufferTXN *txn)
> +{
> + /*
> + * If GUC 'debug_logical_replication_streaming' is "immediate", we don't
> + * check the transaction status, so the caller always processes this
> + * transaction.
> + */
> + if (unlikely(debug_logical_replication_streaming ==
> DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE))
> + return false;
> +
>
> The wording of the sentence "This is to disable..." seemed a bit
> confusing. Maybe this area can be simplified by doing the following.
>
> 6a.
> Change the function comment to say more like below:
>
> When the GUC 'debug_logical_replication_streaming' is set to
> "immediate", we don't check the transaction status, meaning the caller
> will always process this transaction. This mode is used by regression
> tests to avoid unnecessary transaction status checking.
>
> ~
>
> 6b.
> It is not necessary for this 2nd comment to repeat everything that was
> already said in the function comment. A simpler comment here might be
> all you need:
>
> SUGGESTION:
> Quick return for regression tests.

Agreed with the above two comments. Fixed.

>
> ~~~
>
> 7.
> Is it worth mentioning about this skipping of the transaction status
> check in the docs for this GUC? [1]

If we want to mention this optimization in the docs, we have to
explain how the optimization works too. I think it's too detailed.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v6-0001-Skip-logical-decoding-of-already-aborted-transact.patch application/octet-stream 18.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2024-11-11 18:03:07 Re: index prefetching
Previous Message Ivan Kush 2024-11-11 17:29:14 Re: Replace IN VALUES with ANY in WHERE clauses during optimization