Re: Skip collecting decoded changes of already-aborted transactions

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skip collecting decoded changes of already-aborted transactions
Date: 2024-11-19 07:12:44
Message-ID: CALDaNm3COogB2_aPgFZ_pNBh7BjmwDqoBj5bE4H7D26F3hJQ=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 15 Nov 2024 at 23:32, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Nov 14, 2024 at 7:07 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Sawada-Sn,
> >
> > Here are some review comments for patch v8-0001.
>
> Thank you for the comments.
>
> >
> > ======
> > contrib/test_decoding/sql/stats.sql
> >
> > 1.
> > +-- The INSERT changes are large enough to be spilled but not, because the
> > +-- transaction is aborted. The logical decoding skips collecting further
> > +-- changes too. The transaction is prepared to make sure the decoding processes
> > +-- the aborted transaction.
> >
> > /to be spilled but not/to be spilled but will not be/
>
> Fixed.
>
> >
> > ======
> > .../replication/logical/reorderbuffer.c
> >
> > ReorderBufferTruncateTXN:
> >
> > 2.
> > /*
> > * Discard changes from a transaction (and subtransactions), either after
> > - * streaming or decoding them at PREPARE. Keep the remaining info -
> > - * transactions, tuplecids, invalidations and snapshots.
> > + * streaming, decoding them at PREPARE, or detecting the transaction abort.
> > + * Keep the remaining info - transactions, tuplecids, invalidations and
> > + * snapshots.
> > *
> > * We additionally remove tuplecids after decoding the transaction at prepare
> > * time as we only need to perform invalidation at rollback or commit prepared.
> > *
> > + * The given transaction is marked as streamed if appropriate and the caller
> > + * asked it by passing 'mark_txn_streaming' being true.
> > + *
> > * 'txn_prepared' indicates that we have decoded the transaction at prepare
> > * time.
> > */
> > static void
> > -ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
> > bool txn_prepared)
> > +ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
> > bool txn_prepared,
> > + bool mark_txn_streaming)
> >
> > I think the function comment should describe the parameters in the
> > same order that they appear in the function signature.
>
> Not sure it should be. We sometimes describe the overall idea of the
> function first while using arguments names, and then describe what
> other arguments mean.
>
> >
> > ~~~
> >
> > 3.
> > + else if (mark_txn_streaming && (rbtxn_is_toptxn(txn) ||
> > (txn->nentries_mem != 0)))
> > + {
> > ...
> > + txn->txn_flags |= RBTXN_IS_STREAMED;
> > + }
> >
> > I guess it doesn't matter much, but for the sake of readability,
> > should the condition also be checking !rbtxn_is_streamed(txn) to avoid
> > overwriting the RBTXN_IS_STREAMED bit when it was set already?
>
> Not sure it improves readability because it adds one more check there.
> If it's important not to re-set RBTXN_IS_STREAMED, it makes sense to
> have that check and describe in the comment. But in this case, I think
> we don't necessarily need to do that.
>
> > ~~~
> >
> > ReorderBufferTruncateTXNIfAborted:
> >
> > 4.
> > + /*
> > + * The transaction aborted. We discard the changes we've collected so far,
> > + * and free all resources allocated for toast reconstruction. The full
> > + * cleanup will happen as part of decoding ABORT record of this
> > + * transaction.
> > + *
> > + * Since we don't check the transaction status while replaying the
> > + * transaction, we don't need to reset toast reconstruction data here.
> > + */
> > + ReorderBufferTruncateTXN(rb, txn, false, false);
> >
> > 4a.
> > The first part of the comment says "... and free all resources
> > allocated for toast reconstruction", but the second part says "we
> > don't need to reset toast reconstruction data here". Is that a
> > contradiction?
>
> Yes, the comment is out-of-date. Since this function is not called
> while replaying the transaction, it should not have any toast
> reconstruction data.
>
> >
> > ~
> >
> > 4b.
> > Shouldn't this call still be passing rbtxn_prepared(txn) as the 2nd
> > last param, like it used to?
>
> Actually it's not necessary because it should always be false. But
> thinking more, it seems to be better to use rbtxn_preapred(txn) since
> it's consistent with other places and it's not necessary to put
> assumptions there.

Few comments:
1) Should we have the Assert inside ReorderBufferTruncateTXNIfAborted
instead of having it at multiple callers, ReorderBufferResetTXN also
has the Assert inside the function after truncate of the transaction:
@@ -3672,6 +3758,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
Assert(txn->total_size > 0);
Assert(rb->size >= txn->total_size);

+ /* skip the transaction if aborted */
+ if (ReorderBufferTruncateTXNIfAborted(rb, txn))
+ {
+ /* All changes should be discarded */
+ Assert(txn->size == 0 && txn->total_size == 0);
+ continue;
+ }
+
ReorderBufferStreamTXN(rb, txn);
}
else
@@ -3687,6 +3781,14 @@ ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
Assert(txn->size > 0);
Assert(rb->size >= txn->size);

+ /* skip the transaction if aborted */
+ if (ReorderBufferTruncateTXNIfAborted(rb, txn))
+ {
+ /* All changes should be discarded */
+ Assert(txn->size == 0 && txn->total_size == 0);
+ continue;
+ }

2) txn->txn_flags can be moved to the next line to keep it within 80
chars in this case:
* Check the transaction status by looking CLOG and discard all changes if
* the transaction is aborted. The transaction status is cached in
txn->txn_flags
* so we can skip future changes and avoid CLOG lookups on the next call. Return

3) Is there any scenario where the Assert can fail as the toast is not reset:
+ * Since we don't check the transaction status while replaying the
+ * transaction, we don't need to reset toast reconstruction data here.
+ */
+ ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn), false);

+ if (ReorderBufferTruncateTXNIfAborted(rb, txn))
+ {
+ /* All changes should be discarded */
+ Assert(txn->size == 0 && txn->total_size == 0);
+ continue;
+ }

4) This can be changed to a single line comment:
+ /*
+ * Quick return if the transaction status is already known.
+ */
+ if (rbtxn_is_committed(txn))
+ return false;

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nisha Moond 2024-11-19 07:12:55 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message Shlok Kyal 2024-11-19 07:11:46 Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY