Re: Skip collecting decoded changes of already-aborted transactions

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skip collecting decoded changes of already-aborted transactions
Date: 2025-01-14 01:35:45
Message-ID: CAHut+PvCtpBrWBkNuqgu7QB+4ZRxLeY7WnZvFmCYHByzsWhfmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Sawada-San. Here are some cosmetic review comments for the patch v13-0001.

======
Commit message

1.
This commit introduces an additional CLOG lookup to check the
transaction status, so the logical decoding skips further change also
when it doesn't touch system catalogs if the transaction is already
aborted. This optimization enhances logical decoding performance,
especially for large transactions that have already been rolled back,
as it avoids unnecessary disk or network I/O.

~

That first sentence seems confusing. How about:

This commit adds a CLOG lookup to check the transaction status,
allowing logical decoding to skip changes for non-system catalogs if
the transaction is already aborted.

On Tue, Jan 14, 2025 at 5:56 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jan 6, 2025 at 5:52 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Sawada-San.
> >
> > Here are some review comments for the patch v12-0001.
>
> Thank you for reviewing the patch!
>
> >
> > ======
> > .../replication/logical/reorderbuffer.c
> >
> > ReorderBufferCheckAndTruncateAbortedTXN:
> >
> > ~~~
> >
> > 3.
> > + if (TransactionIdDidCommit(txn->xid))
> > + {
> > + /*
> > + * Remember the transaction is committed so that we can skip CLOG
> > + * check next time, avoiding the pressure on CLOG lookup.
> > + */
> > + Assert(!rbtxn_is_aborted(txn));
> > + txn->txn_flags |= RBTXN_IS_COMMITTED;
> > + return false;
> > + }
> > +
> > + /*
> > + * The transaction aborted. We discard the changes we've collected so far
> > + * and toast reconstruction data. The full cleanup will happen as part of
> > + * decoding ABORT record of this transaction.
> > + */
> > + ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn));
> > + ReorderBufferToastReset(rb, txn);
> > +
> > + /* All changes should be discarded */
> > + Assert(txn->size == 0);
> > +
> > + /*
> > + * Mark the transaction as aborted so we can ignore future changes of this
> > + * transaction.
> > + */
> > + Assert(!rbtxn_is_committed(txn));
> > + txn->txn_flags |= RBTXN_IS_ABORTED;
> > +
> > + return true;
> > +}
> >
> > 3a.
> > That whole last part related to "The transaction aborted", might be
> > clearer if the whole chunk of code was in an 'else' block from the
> > previous "if (TransactionIdDidCommit(txn->xid))".
>
> I'm not sure it increases the readability. I think it pretty makes
> sense to me that we return false in the 'if
> (TransactionIdDidCommit(txn->xid))' block. If we add the 'else' block,
> the reader might be confused as we have the 'else' block in spite of
> having the return in the 'if' block. We can add a local variable for
> the result and return it at the end of the function but I'm not sure
> it's a good idea to increase the readability.
>

2.
I think adding a local variable is overkill but OTOH introducing
“else” clarifies that the following code can only be reached when the
transaction is aborted. E.g. You don’t even need to read the previous
code block and see the “return false” to know that. Anyway, it’s
probably just a personal preference.

> > 3c.
> > The "and toast reconstruction data" seems to be missing a word/s. (??)
> > - "... and also discard TOAST reconstruction data"
> > - "... and reset TOAST reconstruction data"
>
> I don't understand this comment. What words are you suggesting adding
> to these sentences?
>

3.
I meant something like:

BEFORE
We discard the changes we've collected so far and toast reconstruction data.

SUGGESTION
We discard both the changes collected so far and the TOAST reconstruction data.

======
src/include/replication/reorderbuffer.h

4.
-/* Has this transaction been prepared? */
+/*
+ * Is this transaction a prepared transaction?
+ *
+ * Being true means that this transaction should be prepared instead of
+ * committed. To check whether a prepare or a stream_prepare has already
+ * been sent for this transaction, we need to use rbtxn_sent_prepare().
+ */

/Is this transaction a prepared transaction?/Is this a prepared transaction?/

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-01-14 01:43:27 Re: Conflict detection for update_deleted in logical replication
Previous Message Peter Smith 2025-01-14 01:07:15 Re: Question about behavior of deletes with REPLICA IDENTITY NOTHING