Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

From: Allan Kamau <kamauallan(at)gmail(dot)com>
To: Kristjan Mustkivi <sonicmonkey(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12
Date: 2022-10-27 07:40:46
Message-ID: CAF3N6oS6L0r=sw6PhSZDO0KL4kc9xoSGnU=qs2cvj6d+WRzRCA@mail.gmail.com
Lists: pgsql-general

On Thu, Oct 27, 2022 at 10:26 AM Allan Kamau <kamauallan(at)gmail(dot)com> wrote:

>
>
> On Thu, Oct 27, 2022 at 10:20 AM Kristjan Mustkivi <sonicmonkey(at)gmail(dot)com>
> wrote:
>
>> Dear community,
>>
>> Right after upgrading our postgres servers from 11.15 to 11.17 we
>> started to encounter problems with data. Namely, when the query hit
>> the index, it returned a single row; when the query hit a relation
>> directly, it returned more than one row. Attempt to REINDEX revealed
>> the underlying data had duplicates (unique index reindexing failed).
>>
>> Version facts:
>> we started out with 11.12
>> jan 2022 upgraded to 11.14
>> mar 2022 to 11.15
>> oct 2022 to 11.17
>>
>> We are not sure when this corruption actually happened. Could it be
>> related to the indexing bugs reported in
>> https://www.postgresql.org/docs/release/11.14/? And the condition only
>> became known to us after 11.17 rollout which can perhaps be explained
>> by the following: while 11.17 does not have any outstanding index
>> related fixes, then https://www.postgresql.org/docs/release/11.15/
>> mentions fix for index-only scans and so does
>> https://www.postgresql.org/docs/release/11.16/.
>>
>> The bottom line is we would like to understand if the index corruption
>> and its manifestation is explained by the above release fixes or is
>> there something else that should be investigated further here with the
>> help from the community.
>>
>> With best regards,
>> --
>> Kristjan Mustkivi
>>
>> Email: kristjan(dot)mustkivi(at)gmail(dot)com
>>
>
> Hi Kristjan,
> What if you select the row id and the column covered by the problematic
> index into a new table? You could then query that table to test the
> uniqueness of the column on which the problematic index was reported.
>
> Allan.
>

How was the data transferred between upgrades? Was it by dump and restore?
If you still have the 11.15 instance running with the data, you could select
the row id and the column the index is built on into a new table, then query
that table to check whether the values in it are unique. Then do the same on
the 11.17 instance.
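
The check described above could be sketched as follows. This is only an illustration under some assumptions: "row id" is taken to mean the ctid system column (a primary key column would work as well), the names my_table, my_col and dup_check are hypothetical, and the planner settings are switched off so that the rows are read from the table itself rather than through the suspect index. The helper only builds the SQL text; the statements would still need to be run in psql on each instance.

```python
def duplicate_check_sql(table, column, copy_name="dup_check"):
    """Build SQL statements for the uniqueness test (names are inserted
    verbatim, so pass only trusted identifiers)."""
    return [
        # Keep the planner away from the suspect index for this session.
        "SET enable_indexscan = off;",
        "SET enable_indexonlyscan = off;",
        "SET enable_bitmapscan = off;",
        # ctid is aliased because a user column cannot be named ctid.
        f"CREATE TABLE {copy_name} AS "
        f"SELECT ctid AS src_ctid, {column} FROM {table};",
        # Every row returned is a value that occurs more than once.
        f"SELECT {column}, count(*) AS n FROM {copy_name} "
        f"GROUP BY {column} HAVING count(*) > 1;",
    ]


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    for stmt in duplicate_check_sql("my_table", "my_col"):
        print(stmt)
```

If the last statement returns no rows on 11.15 but does on 11.17 (or returns rows on both), that would at least narrow down when the duplicates appeared.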

Would it be possible to build and install PG 15 from source into a different
directory (using --prefix), then run pg_dump with the binaries of that
installation, writing the dump into a directory? Then configure the PG 15
installation to listen on a different TCP/IP port from the one your 11.17
instance currently uses. Once it is started and the dump is loaded, test
whether the index anomaly is present in the PG 15 instance. Alternatively,
you could use the PG 15 docker image to start a PG 15 container for your
tests, instead of having to build and install PG 15 yourself.
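
A rough sketch of the command sequence for such a side-by-side test is below. The prefix /opt/pg15, the data directory, the ports, the database name and the container name are all assumptions made for illustration, and the restore step is my addition to complete the procedure. The script only prints the commands so they can be reviewed before anything is run.

```python
import shlex


def pg15_test_commands(dbname, dump_dir, old_port=5432, new_port=5433,
                       prefix="/opt/pg15"):
    """Return argv lists for dumping from the 11.17 instance and
    restoring into a separate PG 15 instance (all paths hypothetical)."""
    bin_dir = f"{prefix}/bin"
    data_dir = f"{prefix}/data"
    return [
        # Dump with the PG 15 binaries in directory format.
        [f"{bin_dir}/pg_dump", "-p", str(old_port), "-Fd", "-f", dump_dir, dbname],
        # Create and start a fresh PG 15 cluster on another port.
        [f"{bin_dir}/initdb", "-D", data_dir],
        [f"{bin_dir}/pg_ctl", "start", "-D", data_dir, "-o", f"-p {new_port}", "-w"],
        # Load the dump into the new instance.
        [f"{bin_dir}/createdb", "-p", str(new_port), dbname],
        [f"{bin_dir}/pg_restore", "-p", str(new_port), "-d", dbname, dump_dir],
    ]


def docker_command(new_port=5433, name="pg15-test"):
    """Alternative to building from source: the official postgres:15 image."""
    return ["docker", "run", "-d", "--name", name,
            "-e", "POSTGRES_PASSWORD=change-me",
            "-p", f"{new_port}:5432", "postgres:15"]


if __name__ == "__main__":
    for argv in pg15_test_commands("mydb", "/tmp/mydb_dump"):
        print(shlex.join(argv))
    print(shlex.join(docker_command()))
```

Note that if the table really does contain duplicate values, I would expect the restore to fail when it tries to rebuild the unique index, which in itself would confirm that the problem is in the data and not only in the index.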

-Allan
