From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Floris Van Nee <florisvannee(at)optiver(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
Subject: | Re: visibility map corruption |
Date: | 2021-07-24 00:47:18 |
Message-ID: | CAH2-WznU9L5K8PFAKdaPuFgPB9wUWq6Ps_OQm=KNPgR+Rkxk4A@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Jul 23, 2021 at 5:08 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> However, I am now stuck on the commit message text, and I think this is
> the point Peter Geoghegan was trying to make earlier --- while we know
> that preserving the oldest xid in pg_control is the right thing to do,
> and that setting it to the current xid - 2 billion (the old behavior)
> causes vacuum freeze to run on all tables, but what else does this patch
> affect?
As far as I know the only other thing that it might affect is the
traditional use of pg_resetwal: recovering likely-corrupt data, i.e.
getting the database to limp along for long enough to run pg_dump. That
is the only interpretation that makes sense, because the code in
question predates pg_upgrade.
AFAICT that was the original spirit of the code that we're changing here.
> As far as I know, seeing a very low oldest xid causes autovacuum to
> check all objects and make sure their relfrozenxid is less than
> autovacuum_freeze_max_age, but isn't that just a check? Would that
> cause any table scans? I would think not. And would this cause
> incorrect truncation of pg_xact or fsm or vm files? I would think not
> too.
Tom actually wrote this code. I believe that he questioned the whole
basis of it himself quite recently.
Whether or not it's okay to change the behavior in contexts outside of
pg_upgrade (contexts where the user invokes pg_resetwal -x to get the
system to start) is perhaps debatable. It probably doesn't matter very
much if you preserve that behavior for non-pg_upgrade cases -- hard to
say. At the same time it's now easy to see that pg_upgrade shouldn't
be doing this.
> Even if the old and new cluster had mismatched autovacuum_freeze_max_age
> values, I don't see how that would cause any corruption either.
Sometimes the pg_control value for oldest XID is used as the oldest
non-frozen XID that's expected in the table. Other times it's
relfrozenxid itself IIRC.
> I could perhaps see corruption happening if pg_control's oldest xid
> value was closer to the current xid value than it should be, but I can't
> see how having it 2-billion away could cause harm, unless perhaps
> pg_upgrade itself used enough xids to cause the counter to wrap more
> than 2^31 away from the oldest xid recorded in pg_control.
>
> What I am basically asking is how to document this and what it fixes.
ISTM that this is a little like commits 78db307bb2 and a61daa14. Maybe
take a look at those?
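To make the arithmetic behind the "current xid - 2 billion" guess and the
2^31 limit mentioned above concrete, here is a rough sketch. It is a
simplified Python model of 32-bit XID arithmetic, not the actual
pg_resetwal or server code; all function and constant names are
illustrative assumptions, and 200 million is used as the default
autovacuum_freeze_max_age.

```python
# Simplified model of 32-bit transaction ID (XID) arithmetic.
# All names here are illustrative assumptions, not PostgreSQL source code.

XID_MASK = 0xFFFFFFFF                  # XIDs are 32-bit unsigned counters
FIRST_NORMAL_XID = 3                   # XIDs 0-2 are reserved
OLD_BEHAVIOR_OFFSET = 2_000_000_000    # the "current xid - 2 billion" guess
FREEZE_MAX_AGE_DEFAULT = 200_000_000   # default autovacuum_freeze_max_age


def guessed_oldest_xid(next_xid: int) -> int:
    """Old behavior: guess the oldest XID as next_xid - 2 billion, modulo 2^32."""
    oldest = (next_xid - OLD_BEHAVIOR_OFFSET) & XID_MASK
    if oldest < FIRST_NORMAL_XID:
        # Skip over the reserved XIDs after wrapping.
        oldest = (oldest + FIRST_NORMAL_XID) & XID_MASK
    return oldest


def xid_age(next_xid: int, xid: int) -> int:
    """Distance from xid forward to next_xid on the 32-bit circle."""
    return (next_xid - xid) & XID_MASK


def precedes(a: int, b: int) -> bool:
    """Circular comparison: a is older than b if the signed 32-bit
    difference (a - b) is negative."""
    return ((a - b) & XID_MASK) >= 0x80000000


next_xid = 1000
oldest = guessed_oldest_xid(next_xid)

# The guessed value looks 2 billion XIDs old, far beyond the freeze
# threshold, so every table appears to be overdue for freezing.
looks_overdue = xid_age(next_xid, oldest) > FREEZE_MAX_AGE_DEFAULT

# Headroom left before the guessed value is more than 2^31 behind the
# counter, at which point the circular comparison flips.
headroom = 2**31 - OLD_BEHAVIOR_OFFSET
```

In this model the guessed value still compares as "older" than the next
XID, but only about 147 million further XIDs can be consumed before the
comparison flips, which is the scenario described in the quoted
paragraph about wrapping more than 2^31 away.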
--
Peter Geoghegan