Re: BUG #18369: logical decoding core on AssertTXNLsnOrder()

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: ocean_li_996 <ocean_li_996(at)163(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18369: logical decoding core on AssertTXNLsnOrder()
Date: 2024-07-12 02:30:18
Message-ID: CAD21AoB5HQAUCWEzTr8zK=1AnU6-2xRFyAx3TZ_ymrVC6du2hA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, May 16, 2024 at 4:23 PM ocean_li_996 <ocean_li_996(at)163(dot)com> wrote:
>
> Hi all:
>
>
> Two months have passed by now, and I'm wondering if the issue mentioned in [1] has been resolved yet?

I've pushed the fix, bb19b70081.

>
> Returning to this question, does anyone have any new opinions on it? If not, should we consider fixing this problem?
>
> Or is this considered a bug that's not worth fixing?

I think we should fix this bug too. I've reviewed the proposed v4
patch for pg15 and here are comments:

---
+# DML operations on temporary table temp_t consume XIDs, but do not generate
+# WAL records. The last decoding restarts from the first checkpoint and doesn't
+# get any information of last two substransactions that performed
s0_insert_temp.
+# While processing the commit record for the corresponding top-level
+# transaction which will be marked as containing catalog change even before
+# commit, we ensure that the corresponding substransaction is added into
+# ReorderBuffer as subxact.
+permutation "s0_create_temp" "s0_init" "s0_begin" "s0_txid_current"
"s1_checkpoint" "s0_savepoint" "s0_insert_temp" "s0_savepoint_release"
"s0_savepoint" "s0_insert_temp" "s0_savepoint_release" "s0_savepoint"
"s0_insert_temp" "s0_savepoint_release" "s0_create_part1"
"s1_get_changes" "s0_commit" "s1_get_changes"

I think the new test case can be simplified to something like:

permutation "s0_create_temp" "s0_init" "s0_begin" "s0_txid_current"
"s1_checkpoint" "s0_truncate" "s0_savepoint" "s0_insert_temp"
"s0_savepoint" "s0_insert_temp" "s0_savepoint" "s0_insert_temp"
"s1_get_changes" "s0_commit" "s1_get_changes"

It has fewer steps and emits fewer WAL records.

---
+ *
+ * Note: There is a possibility that the
transaction was created
+ * as not the top-level txn, but the flag was
not set. In this
+ * case, the transaction was not added to the
list. This happens if
+ * sub-txns are first recognized and decoded
by a COMMIT record.
*/
- dlist_delete(&subtxn->node);
+ if (subtxn->node.next != NULL)
+ dlist_delete(&subtxn->node);
}

I think that the new if statement can be merged to the above else
block (i.e, "else if (subtx->node.next != NULL)").

It's not clear to me what the "the flag" in the new comment refers to.
How about rephrasing it to:

/*
* We already saw this transaction, but might have initially added
* it to the list of top-level txns. Now that we know it's not
* top-level, remove it from there if it's already attached to the
* list.
*/

---
/*
- * Remove TXN from its containing list.
+ * Remove TXN from its containing list if registered.
*

How about "if attached"? On HEAD, we use the terms "detached" for
dlist_node (see dlist_node_is_detached() in ilist.h).

---
+void
+ReorderBufferXidSetCatalogChangesEx(ReorderBuffer *rb, TransactionId xid,
+ XLogRecPtr lsn, bool is_top)
{
ReorderBufferTXN *txn;

- txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
+ txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, is_top);

and

- ReorderBufferXidSetCatalogChanges(builder->reorder,
subxacts[i], lsn);
+ ReorderBufferXidSetCatalogChangesEx(builder->reorder,
subxacts[i], lsn, false);

As Kuroda-san previously mentioned[1], with this change, we would end
up with a transaction entry that is not the top-level transaction but
is not assigned to any top-level transaction. Such transactions are
created in SnapBuildXidSetCatalogChanges(), used solely for reference
by SnapBuildCommitTxn(), and promptly cleaned up afterward. Therefore
I don't believe it would pose a significant issue, but I will
contemplate it further. I feel it's somewhat of a hacky solution just
to avoid AssertTXNLsnOrder().

In SnapBuildXidSetCatalogChanges(), since we know that all XIDs in the
subxacts array are subtransactions of the transaction 'xid', another
way to fix it would be to properly assign all subtransactions to the
top-level transaction (i.e., by calling ReorderBufferAssignChild())
before setting each of them as catalog-changes. However, this change
alone is insufficient, as it has been possible that we accumulate
invalidation messages into a transaction that will later be identified
as a subtransaction. IOW, another assertion in ReorderBufferForget()
would fail:

/*
* Process cache invalidation messages if there are any. Even if we're not
* interested in the transaction's contents, it could have manipulated the
* catalog and we need to update the caches according to that.
*/
if (txn->base_snapshot != NULL && txn->ninvalidations > 0)
ReorderBufferImmediateInvalidation(rb, txn->ninvalidations,
txn->invalidations);
else
Assert(txn->ninvalidations == 0);

Note that when assigning a subtransaction to its top-level
transaction, we transfer the subtransaction's snapshot to its
top-level transaction but not for the invalidation messages in
ReorderBufferTXN. Consequently, it's possible to have subtransactions
that lack the base snapshot but possess invalidation messages.
Therefore, we would need to check if txn->ninvalidations == 0 only for
top-level transactions. That is:

- else
+ else if (!rbtxn_is_known_subxact(txn))
Assert(txn->ninvalidations == 0);

I'm uncertain which approach is better or if there exists an even
superior approach. I'll continue contemplating it. Feedback is highly
welcomed.

Regards,

[1] https://www.postgresql.org/message-id/TYCPR01MB1207790E98F0A563280CC39FCF5262%40TYCPR01MB12077.jpnprd01.prod.outlook.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Masahiko Sawada 2024-07-12 02:52:20 Re: Potential data loss due to race condition during logical replication slot creation
Previous Message Michael Paquier 2024-07-12 01:20:56 Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae