Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, Michael Paquier <michael(at)paquier(dot)xyz>, Петър Славов <pet(dot)slavov(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
Date: 2022-05-25 12:39:14
Message-ID: CA+TgmoYP7BnHXTEoJRUnOuie2J05Qo8ti5mnE2u-1c4Hk+8zJw@mail.gmail.com
Lists: pgsql-bugs

On Wed, May 25, 2022 at 7:45 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > Basically snapshots don't work anymore. If PROC_IN_SAFE_IC is set,
> > that backend is ignored for the horizon computation for snapshots /
> > on-access HOT pruning. Which means that rows that are visible to the
> > snapshot can be pruned away.
>
> I wondered if we could have different tuple horizons for HOT pruning
> than for vacuum, but looking at ComputeXidHorizons() and users of that,
> it looks complicated to adapt.
>
> Another possibility (than reverting the commit altogether) might be to
> disable HOT pruning while a process is operating on that relation with
> PROC_IN_SAFE_IC. So CIC/RIC processes are still ignored for VACUUM,
> while not creating corrupted indexes.

I'm not sure that would be a win, because HOT pruning is great as long
as the tuples you're pruning are old enough. Also, it seems like it
would require complex new infrastructure that I think we should be
reluctant to invent in back branches.

It seems to me that we should just revert. As far as I can see, and
for sure I might be missing something, this is a classic case of an
idea that seemed good at the time but turns out not to work. When we
look at a recently HOT-updated tuple, we need to know whether the
original insertion happened before or after the index build. We can't
determine that if we prune away the tuples that store the xmin values
we need. So it turns out we need everyone to respect the snapshot held
by the CIC/RIC backend after all. Bummer.
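
To make the failure mode concrete, here is a toy model of the horizon
problem. It is only a sketch under stated assumptions and not
PostgreSQL's actual code: Backend, prune_horizon, and prunable are
hypothetical names, XIDs are plain integers with no wraparound, and the
safe_ic field merely stands in for PROC_IN_SAFE_IC.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for a backend's entry in the proc array."""
    name: str
    xmin: int              # oldest XID this backend's snapshot still needs
    safe_ic: bool = False  # stands in for the PROC_IN_SAFE_IC flag


def prune_horizon(backends, ignore_safe_ic):
    """Toy horizon: the oldest xmin among the backends that are considered.

    Assumption: at least one backend is always considered.
    """
    considered = [
        b.xmin for b in backends
        if not (ignore_safe_ic and b.safe_ic)
    ]
    return min(considered)


def prunable(dead_tuple_xmax, horizon):
    """Toy rule: a dead tuple version may be removed once the XID that
    superseded it is older than the horizon."""
    return dead_tuple_xmax < horizon


# Scenario: a CIC/RIC backend holds a snapshot with xmin 100, an ordinary
# backend has xmin 120, and XID 110 HOT-updated a row in between.
backends = [
    Backend("cic", xmin=100, safe_ic=True),
    Backend("other", xmin=120),
]
old_version_xmax = 110

# Ignoring the flagged backend moves the horizon past XID 110, so the old
# version becomes prunable even though the CIC snapshot still needs it.
horizon_ignoring = prune_horizon(backends, ignore_safe_ic=True)
horizon_respecting = prune_horizon(backends, ignore_safe_ic=False)
lost_to_pruning = prunable(old_version_xmax, horizon_ignoring)
kept_when_respected = not prunable(old_version_xmax, horizon_respecting)
```

In this model the old tuple version survives only when the flagged
backend's xmin counts toward the horizon, which is the argument for the
revert.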

--
Robert Haas
EDB: http://www.enterprisedb.com
