From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: vac_truncate_clog()'s bogus check leads to bogusness |
Date: | 2023-06-24 01:41:58 |
Message-ID: | 20230624014158.mdgrlt4knsd3br7c@awork3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2023-06-21 21:50:39 -0700, Noah Misch wrote:
> On Wed, Jun 21, 2023 at 05:46:37PM -0700, Andres Freund wrote:
> > A related issue is that as far as I can tell the determination of what is
> > bogus is bogus.
> >
> > The relevant cutoffs are determined vac_update_datfrozenxid() using:
> >
> > /*
> > * Identify the latest relfrozenxid and relminmxid values that we could
> > * validly see during the scan. These are conservative values, but it's
> > * not really worth trying to be more exact.
> > */
> > lastSaneFrozenXid = ReadNextTransactionId();
> > lastSaneMinMulti = ReadNextMultiXactId();
> >
> > but doing checks based on thos is bogus, because:
> >
> > a) a concurrent create table / truncate / vacuum can update
> > pg_class.relfrozenxid of some relation in the current database to a newer
> > value, after lastSaneFrozenXid already has been determined. If that
> > happens, we skip updating pg_database.datfrozenxid.
> >
> > b) A concurrent vacuum in another database, ending up in vac_truncate_clog(),
> > can compute a newer datfrozenxid. In that case the vac_truncate_clog() with
> > the outdated lastSaneFrozenXid will not truncate the clog (and also forget
> > to release WrapLimitsVacuumLock currently, as reported upthread) and not
> > call SetTransactionIdLimit(). The latter is particularly bad, because that
> > means we might not come out of "database is not accepting commands" land.
>
> > I guess we could just recompute the boundaries before actually believing the
> > catalog values are bogus?
>
> That's how I'd do it.
I was looking at doing that and got confused by the current code. Am I missing
something, or does vac_truncate_clog() have two pretty much identical attempts
at a safety measures?
void
vac_update_datfrozenxid(void)
...
lastSaneFrozenXid = ReadNextTransactionId();
...
vac_truncate_clog(newFrozenXid, newMinMulti,
lastSaneFrozenXid, lastSaneMinMulti);
}
...
static void
vac_truncate_clog(TransactionId frozenXID,
MultiXactId minMulti,
TransactionId lastSaneFrozenXid,
MultiXactId lastSaneMinMulti)
{
TransactionId nextXID = ReadNextTransactionId();
...
/*
* If things are working properly, no database should have a
* datfrozenxid or datminmxid that is "in the future". However, such
* cases have been known to arise due to bugs in pg_upgrade. If we
* see any entries that are "in the future", chicken out and don't do
* anything. This ensures we won't truncate clog before those
* databases have been scanned and cleaned up. (We will issue the
* "already wrapped" warning if appropriate, though.)
*/
if (TransactionIdPrecedes(lastSaneFrozenXid, datfrozenxid) ||
MultiXactIdPrecedes(lastSaneMinMulti, datminmxid))
bogus = true;
if (TransactionIdPrecedes(nextXID, datfrozenxid))
frozenAlreadyWrapped = true;
lastSaneFrozenXid is a slightly older version of ReadNextTransactionId(),
that's the only difference afaict.
I guess this might be caused by 78db307bb23 adding the check, but using
GetOldestXmin(NULL, true) to determine lastSaneFrozenXid. That was changed
soon after, in 87f830e0ce03.
Am I missing something?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2023-06-24 01:48:13 | Re: vac_truncate_clog()'s bogus check leads to bogusness |
Previous Message | David G. Johnston | 2023-06-24 01:28:26 | Re: psql: Add role's membership options to the \du+ command |