Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date: 2021-07-30 02:25:48
Message-ID: 20210730022548.GA1940096@gust.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Jul 29, 2021 at 07:21:44PM +0500, Andrey Borodin wrote:
> Currently CIC test breaks indexes even without 2PC. How is it supposed to work if vxid stays vxid through GetLockConflicts()\WaitForLockersMultiple() barrier and then suddenly converts to xid and commits before index validated?

It's an inval race condition. Here's the sequence of events leading to a
failure. "i:" lines are events within the INSERT backend, and "r:" lines are
events within the REINDEX CONCURRENTLY backend:

r: Phase 2 begins.
i: Start INSERT.
i: Acquire RowExclusiveLock on "idx_ccnew".
i: RelationBuildDesc() begins for relation "idx_ccnew".
i: RelationBuildDesc() calls RelationInitIndexAccessInfo(), which loads
indisready=f.
r: Phase 2 commits indisready=t for idx_ccnew.
r: Start waiting for the INSERT to finish.
i: LocalExecuteInvalidationMessage() calls RelationCacheInvalidateEntry() for
the OID of relation "idx_ccnew". Its RelationIdCacheLookup() call returns
NULL, because nothing has passed the relation to RelationCacheInsert().
i: RelationBuildDesc(), the same call that began above, calls
RelationCacheInsert(). This is a problem, because the entry needs
invalidation, but we already consumed the invalidation messages.
i: Inserts heap tuple with xmin=xid1. Does not insert to idx_ccnew. Commits.
r: validate_index() runs, adding the index tuple for xid1.
i: Second INSERT runs, adding a heap tuple with xmin=xid2. It uses the stale
relcache entry having indisready=f, so this insert also does not modify
idx_ccnew. This time, it was responsible for making that index entry;
nothing else will.

In summary, the relcache is deaf to invals of a given relation during
RelationBuildDesc() of that relation. One fix would be to make
RelationBuildDesc() more like this pseudocode:

retry:
WatchInvalFor(targetRelId);
... complicated build steps that may call AcceptInvalidationMessages() ...
if (ReceivedInvalFor(targetRelId))
goto retry;

I'm attaching a proof-of-concept sufficient to let your v7 pgbench test pass.
Among other things, I haven't checked whether other inval message types have
the same hazard. What alternative fix designs should we consider?

Attachment Content-Type Size
inval-build-race-v0.patch text/x-diff 1.7 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Thomas Munro 2021-07-30 05:48:25 Re: BUG #17116: Assert failed in SerialSetActiveSerXmin() on commit of parallelized serializable transaction
Previous Message PG Bug reporting form 2021-07-29 22:35:01 BUG #17128: minimum numeric 'integer' is -2147483647 not -2147483648 as documented