Re: "ERROR: could not open relation with OID 16391" error was encountered when reindexing

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: feichanghong <feichanghong(at)qq(dot)com>
Subject: Re: "ERROR: could not open relation with OID 16391" error was encountered when reindexing
Date: 2024-01-16 12:06:34
Message-ID: CAJ7c6TMM-6mqNzmd83dF51aBKaJ+__uM_sCR60yYGLGCFcp4gw@mail.gmail.com
Lists: pgsql-hackers

Hi,

> This issue was reported on the pgsql-bugs list at the link below, but has not received a reply.
> https://www.postgresql.org/message-id/18286-f6273332500c2a62%40postgresql.org
> Hoping to get some response from the kernel hackers, thanks!
>
> Hi,
> When a REINDEX of a partitioned table's index and a DROP INDEX are executed concurrently, we may encounter the error "could not open relation with OID".
>
> The rebuild of a partitioned table's index is done in multiple transactions and can be summarized in the following steps:
> 1. Obtain the OIDs of all partition indexes in the ReindexPartitions function, then commit the transaction to release all locks.
> 2. Reindex each index in turn:
> 2.1 Start a new transaction
> 2.2 Check whether the index still exists
> 2.3 Call the reindex_index function to rebuild the index
> 2.4 Commit the transaction
>
> No lock is held between steps 2.2 and 2.3 to protect the heap table and the index from being dropped. The reindex_index function does check whether the heap table still exists, but it does not check the index.
>
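
To make the window between steps 2.2 and 2.3 concrete, here is a toy model in plain C. It is only a sketch: it uses no PostgreSQL code, toy_reindex_one and its parameters are made-up names, and the concurrent DROP INDEX is modelled as a boolean rather than a real second session.

```c
#include <stdbool.h>

/* Possible outcomes of one per-index iteration in this toy model. */
typedef enum
{
    TOY_REINDEX_DONE,     /* index was still there and got rebuilt */
    TOY_REINDEX_SKIPPED,  /* index was found missing and skipped quietly */
    TOY_REINDEX_ERROR     /* stands in for "could not open relation" */
} ToyReindexResult;

/*
 * exists_at_check:   does the index exist when step 2.2 looks for it?
 * dropped_in_window: does a concurrent DROP INDEX commit between 2.2 and 2.3?
 * recheck:           is existence checked again once the heap is opened?
 */
static ToyReindexResult
toy_reindex_one(bool exists_at_check, bool dropped_in_window, bool recheck)
{
    bool exists;

    /* Step 2.2: existence check, with no lock held afterwards. */
    if (!exists_at_check)
        return TOY_REINDEX_SKIPPED;

    /* Unprotected window: the index may disappear here. */
    exists = exists_at_check && !dropped_in_window;

    /* Proposed extra check after the heap has been opened. */
    if (recheck && !exists)
        return TOY_REINDEX_SKIPPED;

    /* Step 2.3: trying to open an index that is gone is the failure. */
    if (!exists)
        return TOY_REINDEX_ERROR;

    return TOY_REINDEX_DONE;
}
```

In this model, toy_reindex_one(true, true, false) ends in TOY_REINDEX_ERROR, while the same call with recheck set to true ends in TOY_REINDEX_SKIPPED.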
> One fix I can think of is to check again whether the index still exists after successfully opening the heap table in reindex_index. Something like this:
> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
> index 143fae01eb..21777ec98c 100644
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
> @@ -3594,6 +3594,17 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
>      if (!heapRelation)
>          return;
>
> +    /*
> +     * Before opening the index, check if the index relation still exists.
> +     * If index relation is gone, leave.
> +     */
> +    if ((params->options & REINDEXOPT_MISSING_OK) != 0 &&
> +        !SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexId)))
> +    {
> +        table_close(heapRelation, NoLock);
> +        return;
> +    }
> +
>      /*
>       * Switch to the table owner's userid, so that any index functions are run
>       * as that user. Also lock down security-restricted operations and
>
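
A small note on the flag test in the patch: in C, `!=` binds more tightly than `&`, so the bitwise AND has to be parenthesized for the condition to test the intended bit. The standalone sketch below shows the difference. TOY_MISSING_OK and its value 0x04 are assumptions made up for illustration, not taken from the PostgreSQL headers, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical option bit, for illustration only. */
#define TOY_MISSING_OK 0x04

/*
 * How the compiler parses "options & TOY_MISSING_OK != 0":
 * the comparison is evaluated first and yields 1, so the whole
 * expression tests bit 0 of options instead of the flag bit.
 */
static bool
flag_test_as_parsed(uint32_t options)
{
    return (options & (TOY_MISSING_OK != 0)) != 0;
}

/* The intended test: mask first, then compare against zero. */
static bool
flag_test_parenthesized(uint32_t options)
{
    return (options & TOY_MISSING_OK) != 0;
}
```

With only the flag bit set (options equal to 0x04), flag_test_as_parsed returns false and flag_test_parenthesized returns true, so the unparenthesized form would skip the existence check in exactly the case it is meant for.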
> The above analysis is based on the latest master branch.
>
> I'm not sure whether my idea is reasonable, so I would appreciate any suggestions. Thanks.

Any chance you could provide minimal steps to reproduce the issue on
an empty PG instance, ideally as a script? That's going to be helpful
to reproduce / investigate the issue and also make sure that it's
fixed.

--
Best regards,
Aleksander Alekseev
