XACT_EVENT_COMMIT placement

From: Yurii Rashkovskii <yrashk(at)omnigres(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: XACT_EVENT_COMMIT placement
Date: 2025-04-25 14:57:00
Message-ID: CAG=VW14zYHC4joE+QtjfOVewHi4B-K_LZq-P6CvJofgkrqJsNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi there,

There is an issue with the existing placement of `XACT_EVENT_COMMIT` in
`CommitTransaction()`. The place where it is called now (and has been
called for many versions before) is a bit too early – it occurs at a time
when other backends might not be able to see what this backend has
committed yet.

I discovered this when I used the `XACT_EVENT_COMMIT` to start a background
worker upon creation of an extension (as part of the installation script).
The symptom was that sometimes the background worker would see the [DDL]
proceeds of the transaction, and sometimes it would not. The case extends
beyond extensions, of course – any such notification to other backends or
external systems using those backends is subject to it.

If you look where CallXactCallbacks is called [1], you can scroll down and
see the following comment: "Make catalog changes visible to all backends."
[2] which was my clue as to what was happening.

I've solved this with a hack. I've observed that `TopTransactionContext` is
being reset even further down the road [3], so I used a memory context
callback to slingshot [4] my actual callback (in the original case, the one
starting a background worker) there. It has worked remarkably well for
quite some time. But it's a hack nevertheless.

Bottom line, I'd like to discuss further course of action and options:

(a) Do we move CallXactCallbacks [1] after the invalidation [2]? Could this
negatively impact existing implementations?
(b) Do we add a new event `XACT_EVENT_POST_COMMIT` and call it after [2] or
even after memory context reset [3]? Are there many implementations of this
callback that don't switch over the event type correctly, and would suffer
from the introduction of the new event?

Personally, out of caution, I currently favor option (b), but I'd greatly
appreciate your thoughts and insights.

[1]
https://github.com/postgres/postgres/blob/0787646e1dce966395f211fb9475dcab32daae70/src/backend/access/transam/xact.c#L2407-L2408
[2]
https://github.com/postgres/postgres/blob/0787646e1dce966395f211fb9475dcab32daae70/src/backend/access/transam/xact.c#L2426-L2433
[3]
https://github.com/postgres/postgres/blob/0787646e1dce966395f211fb9475dcab32daae70/src/backend/access/transam/xact.c#L2486
[4] https://github.com/pgedc/book/blob/master/memcxt_slingshot.md

--
Founder at Omnigres
https://omnigres.com

Browse pgsql-hackers by date

  From Date Subject
Next Message wenhui qiu 2025-04-25 15:04:12 Re: Introduce some randomness to autovacuum
Previous Message Tom Lane 2025-04-25 14:26:09 Re: Allow io_combine_limit up to 1MB