Re: heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead bug (Was: amcheck (B-Tree integrity checking tool))

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "Wood, Dan" <hexpert(at)amazon(dot)com>, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead bug (Was: amcheck (B-Tree integrity checking tool))
Date: 2017-12-13 21:45:45
Message-ID: CAH2-Wz=3h22nWOx4OZRngVjbjgEP8P7QEgytaoXo9zoQ_=z=fA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 18, 2017 at 12:45 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Bringing it back to the concrete freeze-the-dead issue, and the
> question of an XID-cutoff for safely interrogating CLOG: I guess it
> will be possible to assess a HOT chain as a whole. We can probably do
> this safely while holding a super-exclusive lock on the buffer. I can
> probably find a way to ensure this only needs to happen in a rare slow
> path, when it looks like the invariant might be violated but we need
> to make sure (I'm already following this pattern in a couple of
> places). Realistically, there will be some amount of "try it and see"
> here.

I would like to point out for the record/archives that I now believe
that Andres' pending do-over fix for the "Freeze the dead" bug [1]
will leave things in *much* better shape when it comes to
verification. Andres' patch neatly addresses *all* of the concerns
that I raised on this thread. The high-level idea of relfrozenxid as an
unambiguous cut-off point at which it must be safe to interrogate the
CLOG is restored.

Offhand, I'd say that the only interlock amcheck verification now
needs when verifying heap pages against the CLOG is a VACUUM-style
SHARE UPDATE EXCLUSIVE lock on the heap relation being verified. Every
heap tuple must either be observed to be frozen, or must only have
hint bits that are observably in agreement with CLOG. The only
complicated part is the comment that explains why this is
comprehensive and correct (i.e. does not risk false positives or false
negatives). We end up with something that is a bit like a "correct by
construction" design.
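
To make the invariant concrete, here is a rough toy model of the
per-tuple check, covering xmin only. It is a sketch under stated
assumptions, not amcheck code: the function name check_xmin, the
dict standing in for CLOG, and the status strings are all made up for
illustration, and plain integer comparison stands in for real
wraparound-aware XID arithmetic. Only the three infomask bit values
are taken from PostgreSQL's heap tuple header definitions.

```python
# Toy model only: check_xmin, the dict-based "clog", and the status
# strings are hypothetical stand-ins, not amcheck or PostgreSQL APIs.

# Infomask bits as defined for PostgreSQL heap tuple headers.
HEAP_XMIN_COMMITTED = 0x0100
HEAP_XMIN_INVALID = 0x0200
HEAP_XMIN_FROZEN = HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID


def check_xmin(xmin, infomask, relfrozenxid, clog):
    """Return None if the tuple's xmin state is consistent, else a message.

    clog maps xid -> "committed" / "aborted" / "in progress" (assumed
    representation). XIDs are compared as plain integers, which ignores
    wraparound; a real check would need modulo-2^32 comparison.
    """
    bits = infomask & HEAP_XMIN_FROZEN

    # Frozen tuples are always acceptable and never require a CLOG lookup.
    if bits == HEAP_XMIN_FROZEN:
        return None

    # relfrozenxid is the cut-off: an unfrozen xmin older than it is a
    # violation, and CLOG may no longer hold an entry for it.
    if xmin < relfrozenxid:
        return "unfrozen xmin %d precedes relfrozenxid %d" % (xmin, relfrozenxid)

    status = clog.get(xmin)
    if status is None:
        return "no CLOG entry for xmin %d" % xmin

    # Hint bits, where set, must agree with what CLOG reports.
    if bits == HEAP_XMIN_COMMITTED and status != "committed":
        return "xmin %d hinted committed but CLOG says %s" % (xmin, status)
    if bits == HEAP_XMIN_INVALID and status == "committed":
        return "xmin %d hinted invalid but CLOG says committed" % xmin

    return None
```

For example, check_xmin(100, 0, 200, {100: "committed"}) reports a
violation because an unfrozen xmin of 100 precedes a relfrozenxid of
200, while the same tuple with HEAP_XMIN_FROZEN set passes without
touching the CLOG stand-in at all. The invalid-hint case is
deliberately lenient (it only objects to "committed"), on the
assumption that a transaction which never recorded an abort can still
legitimately be hinted invalid.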

The fact that Andres also proposes to add a bunch of new defensive
"can't happen" hard elog()s (mostly by promoting assertions) should
validate the design of tuple + multixact freezing, in the same way
that I hope amcheck can.

[1] https://postgr.es/m/20171114030341.movhteyakqeqx5pm@alap3.anarazel.de
--
Peter Geoghegan
