Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: exclusion(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")
Date: 2024-06-16 20:34:07
Message-ID: 3030961.1718570047@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I wrote:
> In any case, it feels like a bad idea that initSpGistState doesn't
> always make myXid valid: that seems like it's assuming too much
> about when we can create REDIRECT tuples.
> What seems like it should work is
> state->myXid = GetTopTransactionIdIfAny();
> + if (!TransactionIdIsValid(state->myXid))
> + state->myXid = ReadNextTransactionId();
> on the grounds that if we lack an XID, then the next-to-be-assigned
> XID should be a safe expiry horizon for any REDIRECT we might need.

I spent a good deal more time thinking about this, and concluded that
that idea doesn't help. The only case where this is problematic is
redirect tuples inserted during REINDEX CONCURRENTLY. As far as I can
see, using ReadNextTransactionId adds no safety because that XID could
be consumed and then fall below the global xmin horizon before REINDEX
finishes. (If REINDEX CONCURRENTLY held onto a single snapshot for
the duration, that snapshot's xmin would prevent this; but it doesn't,
so there could be times when there's no snapshot anywhere.)

Nonetheless, it seems like it'd mostly be okay to insert invalid XID
in redirects generated by REINDEX CONCURRENTLY. The reason that that
seems safe is that R.C. doesn't allow any scans to use the index at
all till it's done; therefore there could never be a scan "in flight"
to such a redirect. We do allow insertions to happen concurrently
with R.C., but those should never be in flight to a redirect at all,
as explained in src/backend/access/spgist/README. And furthermore,
R.C. also locks out VACUUM. This means that it should be safe for
VACUUM to clean up a redirect inserted by R.C. immediately when it
sees one.

The one problem is that VACUUM's test

GlobalVisTestIsRemovableXid(vistest, dt->xid))

would fail on invalid xid, mainly because FullXidRelativeTo doesn't
work on that. (It'd get an assertion failure, or produce the wrong
value without assertions.) We can fix that by adding an explicit
test for invalid xid, which we really need to do anyway because of
the possibility that existing indexes contain such redirects.

So I conclude that we basically just need to remove the failing
assertion in spgFormDeadTuple and instead add a guard for invalid
xid in vacuumRedirectAndPlaceholder. The attached draft patch
also adds some comments and performs the threatened rename of
myXid to redirectXid. (I wouldn't do that renaming in back
branches, only HEAD.)

Thoughts?

regards, tom lane

Attachment Content-Type Size
fix-spgist-redirect-in-reindex-concurrently.patch text/x-diff 5.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2024-06-16 22:52:52 Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)")
Previous Message PG Bug reporting form 2024-06-16 20:18:32 BUG #18512: Backend memory leak when using command parameters after generating notifications