Re: Skip collecting decoded changes of already-aborted transactions

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skip collecting decoded changes of already-aborted transactions
Date: 2025-01-14 20:02:36
Message-ID: CAD21AoAn8Q8NkkYSbd1JQT+0qPsrQcfXzJczDDbMUeUwCzKRag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 13, 2025 at 5:36 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Sawada-San. Here are some cosmetic review comments for the patch v13-0001.

Thank you for reviewing the patch.

>
> ======
> Commit message
>
> 1.
> This commit introduces an additional CLOG lookup to check the
> transaction status, so the logical decoding skips further change also
> when it doesn't touch system catalogs if the transaction is already
> aborted. This optimization enhances logical decoding performance,
> especially for large transactions that have already been rolled back,
> as it avoids unnecessary disk or network I/O.
>
> ~
>
> That first sentence seems confusing. How about:
>
> This commit adds a CLOG lookup to check the transaction status,
> allowing logical decoding to skip changes for non-system catalogs if
> the transaction is already aborted.

I'm concerned that the proposed sentence doesn't explain the change
enough. I think that what we need to mention in the commit message is
that we will have more opportunities to check the transaction aborts
in addition to when touching system catalogs while replaying a
transaction in streaming mode.

>
> On Tue, Jan 14, 2025 at 5:56 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Mon, Jan 6, 2025 at 5:52 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > >
> > > Hi Sawada-San.
> > >
> > > Here are some review comments for the patch v12-0001.
> >
> > Thank you for reviewing the patch!
> >
> > >
> > > ======
> > > .../replication/logical/reorderbuffer.c
> > >
> > > ReorderBufferCheckAndTruncateAbortedTXN:
> > >
> > > ~~~
> > >
> > > 3.
> > > + if (TransactionIdDidCommit(txn->xid))
> > > + {
> > > + /*
> > > + * Remember the transaction is committed so that we can skip CLOG
> > > + * check next time, avoiding the pressure on CLOG lookup.
> > > + */
> > > + Assert(!rbtxn_is_aborted(txn));
> > > + txn->txn_flags |= RBTXN_IS_COMMITTED;
> > > + return false;
> > > + }
> > > +
> > > + /*
> > > + * The transaction aborted. We discard the changes we've collected so far
> > > + * and toast reconstruction data. The full cleanup will happen as part of
> > > + * decoding ABORT record of this transaction.
> > > + */
> > > + ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn));
> > > + ReorderBufferToastReset(rb, txn);
> > > +
> > > + /* All changes should be discarded */
> > > + Assert(txn->size == 0);
> > > +
> > > + /*
> > > + * Mark the transaction as aborted so we can ignore future changes of this
> > > + * transaction.
> > > + */
> > > + Assert(!rbtxn_is_committed(txn));
> > > + txn->txn_flags |= RBTXN_IS_ABORTED;
> > > +
> > > + return true;
> > > +}
> > >
> > > 3a.
> > > That whole last part related to "The transaction aborted", might be
> > > clearer if the whole chunk of code was in an 'else' block from the
> > > previous "if (TransactionIdDidCommit(txn->xid))".
> >
> > I'm not sure it increases the readability. I think it pretty makes
> > sense to me that we return false in the 'if
> > (TransactionIdDidCommit(txn->xid))' block. If we add the 'else' block,
> > the reader might be confused as we have the 'else' block in spite of
> > having the return in the 'if' block. We can add a local variable for
> > the result and return it at the end of the function but I'm not sure
> > it's a good idea to increase the readability.
> >
>
> 2.
> I think adding a local variable is overkill but OTOH introducing
> “else” clarifies that the following code can only be reached when the
> transaction is aborted. E.g. You don’t even need to read the previous
> code block and see the “return false” to know that. Anyway, it’s
> probably just a personal preference.

I prefer to reduce blocks where possible.

>
> > > 3c.
> > > The "and toast reconstruction data" seems to be missing a word/s. (??)
> > > - "... and also discard TOAST reconstruction data"
> > > - "... and reset TOAST reconstruction data"
> >
> > I don't understand this comment. What words are you suggesting adding
> > to these sentences?
> >
>
> 3.
> I meant something like:
>
> BEFORE
> We discard the changes we've collected so far and toast reconstruction data.
>
> SUGGESTION
> We discard both the changes collected so far and the TOAST reconstruction data.
>

Thanks, fixed.

> ======
> src/include/replication/reorderbuffer.h
>
> 4.
> -/* Has this transaction been prepared? */
> +/*
> + * Is this transaction a prepared transaction?
> + *
> + * Being true means that this transaction should be prepared instead of
> + * committed. To check whether a prepare or a stream_prepare has already
> + * been sent for this transaction, we need to use rbtxn_sent_prepare().
> + */
>
> /Is this transaction a prepared transaction?/Is this a prepared transaction?/
>

Fixed.

I've attached the updated patch (only 0001 patch). I'll submit the
updated patch for 0002 patch once we get consensus on names.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v14-0001-Skip-logical-decoding-of-already-aborted-transac.patch application/octet-stream 21.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alena Rybakina 2025-01-14 20:35:17 Re: New GUC autovacuum_max_threshold ?
Previous Message David E. Wheeler 2025-01-14 20:01:30 Re: RFC: Additional Directory for Extensions