From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel index creation does not properly cleanup after error |
Date: | 2018-03-11 19:41:40 |
Message-ID: | CAH2-WzmGuTdu58BgrEcbH8u1Y=r+u9iqFY5GHaQ=t3iZcOZLDQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Mar 11, 2018 at 3:22 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Due to the failure during the index build, it appears that the
> PG_TRY/PG_CATCH block in reindex_relation() causes the reindex_index()
> to abort and jump out to the catch block. Here there's a call to
> ResetReindexPending(), which complains as we're still left in parallel
> mode from the aborted _bt_begin_parallel() call which has called
> EnterParallelMode(), but not managed to make it all the way to
> _bt_end_parallel() (called from btbuild()), where ExitParallelMode()
> is normally called.
>
> Subsequent attempts to refresh the materialized view result in an
> Assert failure in list_member_oid()
Thanks for the report.
> I've not debugged that, but I assume it's because
> pendingReindexedIndexes is left as a non-empty list but has had its
> memory context obliterated due to the previous query having ended.
It's not really related to memory lifetime, so much as a corruption of
the state that tracks reindexed indexes within a backend. This is of
course due to that "cannot modify reindex state during a parallel
operation" error you saw.
> The comment in the following fragment is not well honored by the
> ResetReindexPending() since it does not clear the list if there's an
> error.
> A perhaps simple fix would be just to have ResetReindexPending() only
> reset the list to NIL again and not try to raise any error.
I noticed a very similar bug in ResetReindexProcessing() just before
parallel CREATE INDEX was committed. The fix there was simply to stop
throwing a "can't happen" error. I agree that the same fix should be
used here. It's not worth enforcing !IsInParallelMode() in the reset
functions; just enforcing !IsInParallelMode() in the set functions is
sufficient. The attached patch does this.
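
To illustrate the idea (this is not the attached patch), here is a minimal standalone sketch of the set/reset asymmetry. Everything in it is a hypothetical stand-in: in_parallel_mode models IsInParallelMode(), the pending array models pendingReindexedIndexes, and a false return models the "cannot modify reindex state during a parallel operation" error. The real backend code differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for backend state; not the real PostgreSQL definitions. */
#define MAX_PENDING 8

static bool     in_parallel_mode = false;   /* models IsInParallelMode() */
static unsigned pending[MAX_PENDING];       /* models pendingReindexedIndexes */
static size_t   npending = 0;

/* "Set" path: refuses to modify the state during a parallel operation. */
static bool
set_reindex_pending(const unsigned *oids, size_t n)
{
    if (in_parallel_mode || n > MAX_PENDING)
        return false;           /* the backend would raise an error here */
    for (size_t i = 0; i < n; i++)
        pending[i] = oids[i];
    npending = n;
    return true;
}

/* "Reset" path: runs during error cleanup, so it clears unconditionally. */
static void
reset_reindex_pending(void)
{
    npending = 0;
}

/*
 * Replays the failure scenario from the report and returns how many
 * entries are left in the pending list afterwards.
 */
static size_t
simulate_failed_parallel_build(void)
{
    const unsigned oids[] = {16384, 16385};  /* made-up index OIDs */

    if (!set_reindex_pending(oids, 2))
        return (size_t) -1;
    in_parallel_mode = true;    /* build enters parallel mode, then fails */
    reset_reindex_pending();    /* cleanup while still in parallel mode */
    in_parallel_mode = false;   /* abort processing leaves parallel mode */
    return npending;
}
```

Because the reset path has no parallel-mode check, simulate_failed_parallel_build() leaves the list empty, so a later command in the same backend starts from clean state. The set path still rejects changes made while in parallel mode.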
--
Peter Geoghegan
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-corruption-of-backend-REINDEX-processing-state.patch | text/x-patch | 1.5 KB |