From: | Aleksander Alekseev <aleksander(at)timescale(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | feichanghong <feichanghong(at)qq(dot)com> |
Subject: | Re: "ERROR: could not open relation with OID 16391" error was encountered when reindexing |
Date: | 2024-01-16 12:06:34 |
Message-ID: | CAJ7c6TMM-6mqNzmd83dF51aBKaJ+__uM_sCR60yYGLGCFcp4gw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
> This issue has been reported in the <pgsql-bugs> list at the link below, but has not received a reply.
> https://www.postgresql.org/message-id/18286-f6273332500c2a62%40postgresql.org
> Hoping to get some response from the kernel hackers, thanks!
>
> Hi,
> When a reindex of a partitioned table's index and a drop of that index are executed concurrently, we may encounter the error "could not open relation with OID".
>
> Rebuilding a partitioned table's index is done across multiple transactions and can be summarized as the following steps:
> 1. Obtain the OIDs of all partition indexes in the ReindexPartitions function, and then commit the transaction to release all locks.
> 2. Reindex each index in turn:
> 2.1 Start a new transaction
> 2.2 Check whether the index still exists
> 2.3 Call the reindex_index function to rebuild the index
> 2.4 Commit the transaction
>
> No lock is held between steps 2.2 and 2.3 to protect the heap table and the index from being dropped. The reindex_index function does check whether the heap table still exists, but it does not check the index.
>
> One fix I can think of is: after successfully opening the heap table in reindex_index, check again whether the index still exists. Something like this:
> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
> index 143fae01eb..21777ec98c 100644
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
> @@ -3594,6 +3594,17 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
>  	if (!heapRelation)
>  		return;
>
> +	/*
> +	 * Before opening the index, check if the index relation still exists.
> +	 * If index relation is gone, leave.
> +	 */
> +	if ((params->options & REINDEXOPT_MISSING_OK) != 0 &&
> +		!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexId)))
> +	{
> +		table_close(heapRelation, NoLock);
> +		return;
> +	}
> +
>  	/*
>  	 * Switch to the table owner's userid, so that any index functions are run
>  	 * as that user. Also lock down security-restricted operations and
>
> The above analysis is based on the latest master branch.
>
> I'm not sure whether my idea is reasonable. I hope you can give me some suggestions. Thanks.
Any chance you could provide minimal steps to reproduce the issue on
an empty PG instance, ideally as a script? That's going to be helpful
to reproduce / investigate the issue and also make sure that it's
fixed.
--
Best regards,
Aleksander Alekseev