Re: vac_truncate_clog()'s bogus check leads to bogusness

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: vac_truncate_clog()'s bogus check leads to bogusness
Date: 2023-06-22 00:46:37
Message-ID: 20230622004637.r4dsh5mgzidx5yix@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-06-21 15:12:08 -0700, Andres Freund wrote:
> When vac_truncate_clog() returns early, due to one of these paths:
>
> [...]
>
> Separately, I think it's quite bad that we *silently* return from
> vac_truncate_clog() when finding a bogus xid. That's a quite severe condition,
> we should at least tell the user about it.

A related issue is that as far as I can tell the determination of what is
bogus is bogus.

The relevant cutoffs are determined in vac_update_datfrozenxid() using:

/*
* Identify the latest relfrozenxid and relminmxid values that we could
* validly see during the scan. These are conservative values, but it's
* not really worth trying to be more exact.
*/
lastSaneFrozenXid = ReadNextTransactionId();
lastSaneMinMulti = ReadNextMultiXactId();

but doing checks based on those values is bogus, because:

a) a concurrent create table / truncate / vacuum can update
pg_class.relfrozenxid of some relation in the current database to a newer
value, after lastSaneFrozenXid has already been determined. If that
happens, we skip updating pg_database.datfrozenxid.

b) A concurrent vacuum in another database, ending up in vac_truncate_clog(),
can compute a newer datfrozenxid. In that case the vac_truncate_clog() with
the outdated lastSaneFrozenXid will not truncate the clog (and currently
also fails to release WrapLimitsVacuumLock, as reported upthread) and will
not call SetTransactionIdLimit(). The latter is particularly bad, because
it means we might not come out of "database is not accepting commands" land.

I think in both cases a later call might fix the issue, but that could be some
way off, if autovacuum doesn't see further writes being necessary and no
further write activity happens because of "database is not accepting
commands".

It's not entirely obvious to me how to best fix these. For a second I thought
we just need to acquire a snapshot before determining the sane values, but
that doesn't work, since we update the relevant fields with
heap_inplace_update().

I guess we could just recompute the boundaries before actually believing the
catalog values are bogus?

I think we should also add warnings to these paths, so we actually have a
chance to find problems in the field.

Greetings,

Andres Freund
