From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Петър Славов <pet(dot)slavov(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY |
Date: | 2022-05-24 18:46:54 |
Message-ID: | 20220524184654.c2zt6coy4s5a6rnh@alap3.anarazel.de |
Lists: | pgsql-bugs |
Hi,
On 2022-05-24 10:38:14 -0700, Peter Geoghegan wrote:
> On Tue, May 24, 2022 at 9:37 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Do we have any idea what really causes the corruption?
>
> I don't think so.
I think I found it: https://postgr.es/m/20220524183705.cmgbqq32z63qynhe%40alap3.anarazel.de
afaict PROC_IN_SAFE_IC is completely broken right now. Any concurrent prune
can remove rows that are visible to the snapshot held by the
PROC_IN_SAFE_IC backend. Which basically makes them "fair weather snapshots" -
they work only as long as there is no concurrent activity.
Similar behavior is fine for VACUUM - it doesn't use a snapshot / need a
consistent view of the table. But not for CIC - otherwise it could just use
SnapshotAny or such.
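As an illustration of the failure mode described above, here is a toy model in Python. It is not PostgreSQL code: `RowVersion`, `Backend`, `removal_horizon`, `prune`, `scan` and `run` are hypothetical names, visibility is reduced to a single `snapshot_xmin` cutoff, and every xid below the cutoff is assumed to have committed. The only thing it tries to show is that a prune whose horizon skips the flagged backend can remove a row version that backend's snapshot still needs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    key: int
    xmin: int                   # xid that created this version
    xmax: Optional[int] = None  # xid that superseded it, None if still live


@dataclass
class Backend:
    snapshot_xmin: int          # xids >= this are invisible to the snapshot
    safe_ic: bool = False       # stands in for PROC_IN_SAFE_IC (assumption)


def removal_horizon(backends: List[Backend], next_xid: int,
                    ignore_safe_ic: bool) -> int:
    """Oldest xid that any *considered* backend may still need to see."""
    xmins = [b.snapshot_xmin for b in backends
             if not (ignore_safe_ic and b.safe_ic)]
    return min(xmins, default=next_xid)


def prune(heap: List[RowVersion], horizon: int) -> List[RowVersion]:
    """Drop versions superseded by an xid older than the horizon."""
    return [v for v in heap if v.xmax is None or v.xmax >= horizon]


def scan(heap: List[RowVersion], snapshot_xmin: int) -> List[int]:
    """Keys visible to a snapshot under the simplified visibility rule."""
    return sorted(v.key for v in heap
                  if v.xmin < snapshot_xmin
                  and (v.xmax is None or v.xmax >= snapshot_xmin))


def run(ignore_safe_ic: bool) -> List[int]:
    # Key 1 was inserted by xid 10 and later updated by xid 20.
    heap = [RowVersion(1, xmin=10, xmax=20), RowVersion(1, xmin=20)]
    # The index-building backend took its snapshot before xid 20 started.
    cic = Backend(snapshot_xmin=15, safe_ic=True)
    horizon = removal_horizon([cic], next_xid=30,
                              ignore_safe_ic=ignore_safe_ic)
    return scan(prune(heap, horizon), cic.snapshot_xmin)
```

In this model `run(False)` returns `[1]`, because the horizon stays at 15 and the old version survives the prune. `run(True)` returns `[]`: the horizon advances to 30, the old version is pruned, and the new version is not visible to the snapshot, so the scan finds nothing for key 1.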
I don't really see a realistic alternative other than reverting at this
point. I think this needs to be rethought fairly fundamentally.
> Andrey's tap test fails for me on 14 as expected, and does so reliably
> -- so there is a fairly good reproducer for this.
>
> I don't have time to debug this right now (...), but it would probably be
> straightforward to get an RR recording of the failure.
I tried that, but it didn't repro under rr within 15min or so.
> (need to work on my pgCon talk)
Good luck :)
> > One thing that'd be worth excluding is the use of parallel index builds.
>
> I can rule out a problem with parallel index builds -- disabling them
> in the tap test doesn't alter the outcome.
Good. Just to clarify: I was suspicious of PROC_IN_SAFE_IC being set
incoherently in parallel workers or such, not of parallel index builds "in
general".
Greetings,
Andres Freund