On Tue, Feb 08, 2022 at 06:04:03PM -0800, Noah Misch wrote:
> On Tue, Feb 08, 2022 at 04:43:47PM -0800, Andres Freund wrote:
> > On 2022-02-08 22:13:01 +0100, Tomas Vondra wrote:
> > > On 10/24/21 03:40, Noah Misch wrote:
> > > > Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY.
> > > >
> > > > CIC and REINDEX CONCURRENTLY assume backends see their catalog changes
> > > > no later than each backend's next transaction start. That failed to
> > > > hold when a backend absorbed a relevant invalidation in the middle of
> > > > running RelationBuildDesc() on the CIC index. Queries that use the
> > > > resulting index can silently fail to find rows. Fix this for future
> > > > index builds by making RelationBuildDesc() loop until it finishes
> > > > without accepting a relevant invalidation. It may be necessary to
> > > > reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices.
> > > > Back-patch to 9.6 (all supported versions).
> > > >
> > > > Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres
> > > > Freund.
> > > >
> > > > Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com
> > > >
> > >
> > > Unfortunately, this seems to have broken CLOBBER_CACHE_ALWAYS builds. Since
> > > this commit, initdb never completes due to infinite retrying over and over
> > > (on the first RelationBuildDesc call).
>
> Thanks for the report. I had added the debug_discard arguments of
> InvalidateSystemCachesExtended() and RelationCacheInvalidate() to make the new
> code survive a CREATE TABLE at debug_discard_caches=5. Apparently that's not
> enough for initdb. I'll queue a task to look at it.

The explanation was more boring than that. v13 and earlier have an additional
InvalidateSystemCaches() call site, which I neglected to update. Here's the
fix I intend to push.
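
For anyone skimming the thread, the failure mode can be sketched with a toy
model. This is not the attached patch and it contains no PostgreSQL code.
Every name in it (toy_build, toy_invalidate, discard_always, and so on) is
hypothetical, and the max_attempts cap exists only so the demonstration
terminates, whereas the real loop has no cap. The assumption being
illustrated: a build retries whenever a "real" invalidation arrives
mid-build, and invalidations forced by the cache-clobbering debug setting
must be flagged as such at every call site, or the build never finishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are PostgreSQL symbols. */
static bool build_in_progress = false;  /* a build is currently running */
static bool build_invalidated = false;  /* a real invalidation hit that build */
static bool discard_always = false;     /* models a CLOBBER_CACHE_ALWAYS build */

/*
 * Record an invalidation.  debug_discard is true when the invalidation was
 * forced by the debug setting rather than by a real catalog change; such
 * invalidations must not make an in-progress build retry.
 */
static void
toy_invalidate(bool debug_discard)
{
    if (build_in_progress && !debug_discard)
        build_invalidated = true;
}

/*
 * Models a point inside the build where pending invalidations are absorbed.
 * With discard_always set, every such point forces an invalidation.  An
 * updated call site passes debug_discard = true; a call site that was
 * missed behaves as if the invalidation were real.
 */
static void
toy_accept_invalidations(bool call_site_updated)
{
    if (discard_always)
        toy_invalidate(call_site_updated);
}

/*
 * Retry the build until it completes without a real invalidation.
 * Returns the number of attempts used, or -1 if max_attempts ran out
 * (the stand-in for "loops forever").
 */
static int
toy_build(bool call_site_updated, int max_attempts)
{
    int attempt;

    assert(max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++)
    {
        build_in_progress = true;
        build_invalidated = false;

        toy_accept_invalidations(call_site_updated);

        build_in_progress = false;
        if (!build_invalidated)
            return attempt;     /* finished cleanly */
    }
    return -1;                  /* never finished within the cap */
}
```

With discard_always set, toy_build(true, 5) finishes on the first attempt,
while toy_build(false, 5) exhausts its attempts, which is the toy analogue
of initdb retrying forever on the first RelationBuildDesc() call.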