datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: datfrozenxid > relfrozenxid w/ crash before XLOG_HEAP_INPLACE
Date: 2024-06-20 01:29:08
Message-ID: 20240620012908.92.nmisch@google.com
Lists: pgsql-hackers

https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> Separable, nontrivial things not fixed in the attached patch stack:

> - Trouble is possible, I bet, if the system crashes between the inplace-update
> memcpy() and XLogInsert(). See the new XXX comment below the memcpy().

That comment:

/*----------
 * XXX A crash here can allow datfrozenxid to get ahead of relfrozenxid:
 *
 * ["D" is a VACUUM (ONLY_DATABASE_STATS)]
 * ["R" is a VACUUM tbl]
 * D: vac_update_datfrozenxid() -> systable_beginscan(pg_class)
 * D: systable_getnext() returns pg_class tuple of tbl
 * R: memcpy() into pg_class tuple of tbl
 * D: raise pg_database.datfrozenxid, XLogInsert(), finish
 * [crash]
 * [recovery restores datfrozenxid w/o relfrozenxid]
 */

> Might solve this by inplace update setting DELAY_CHKPT, writing WAL, and
> finally issuing memcpy() into the buffer.

That fix worked. I'm attaching it, along with two not-for-commit patches: one
adding a test case and one with the fix rebased onto that test case. Apply on
top of the v2 patch stack from
https://postgr.es/m/20240617235854.f8.nmisch@google.com.
The fix gets its key testing from 027_stream_regress.pl; when I commented out
some memcpy lines of the heapam.c change, that test caught it.
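
For readers following the ordering argument, here is a toy model in C of the
two orderings. It is only a sketch, not PostgreSQL code: ToyCluster,
toy_wal_insert(), toy_update_datfrozenxid(), toy_crash_recover() and
toy_dat_overtakes_rel() are hypothetical names invented for illustration. It
assumes every inserted WAL record reaches disk before the crash, compares XIDs
with plain ">" (ignoring wraparound), and does not model DELAY_CHKPT or
checkpoints at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
typedef enum { FIELD_RELFROZENXID = 0, FIELD_DATFROZENXID = 1 } Field;

typedef struct { Field field; uint32_t value; } WalRecord;

typedef struct {
    uint32_t  disk[2];    /* on-disk values as of the last checkpoint */
    uint32_t  buffer[2];  /* shared-buffer values, lost at crash */
    WalRecord wal[8];     /* WAL, assumed durable once inserted */
    size_t    nwal;
} ToyCluster;

static void toy_init(ToyCluster *c, uint32_t xid)
{
    c->disk[FIELD_RELFROZENXID] = c->buffer[FIELD_RELFROZENXID] = xid;
    c->disk[FIELD_DATFROZENXID] = c->buffer[FIELD_DATFROZENXID] = xid;
    c->nwal = 0;
}

static void toy_wal_insert(ToyCluster *c, Field f, uint32_t v)
{
    assert(c->nwal < sizeof(c->wal) / sizeof(c->wal[0]));
    c->wal[c->nwal].field = f;
    c->wal[c->nwal].value = v;
    c->nwal++;
}

/* Session D: read relfrozenxid from the buffer, then raise datfrozenxid
 * with WAL written before the buffer change. */
static void toy_update_datfrozenxid(ToyCluster *c)
{
    uint32_t seen = c->buffer[FIELD_RELFROZENXID];

    if (seen > c->buffer[FIELD_DATFROZENXID])
    {
        toy_wal_insert(c, FIELD_DATFROZENXID, seen);
        c->buffer[FIELD_DATFROZENXID] = seen;
    }
}

/* Crash: buffers are discarded, then WAL is replayed over the disk state. */
static void toy_crash_recover(ToyCluster *c)
{
    c->buffer[FIELD_RELFROZENXID] = c->disk[FIELD_RELFROZENXID];
    c->buffer[FIELD_DATFROZENXID] = c->disk[FIELD_DATFROZENXID];
    for (size_t i = 0; i < c->nwal; i++)
        c->buffer[c->wal[i].field] = c->wal[i].value;
}

/* Returns true if datfrozenxid ends up ahead of relfrozenxid after recovery. */
bool toy_dat_overtakes_rel(bool wal_first)
{
    ToyCluster c;

    toy_init(&c, 100);
    if (wal_first)
    {
        /* R: write WAL, then change the buffer; D runs afterwards. */
        toy_wal_insert(&c, FIELD_RELFROZENXID, 200);
        c.buffer[FIELD_RELFROZENXID] = 200;
        toy_update_datfrozenxid(&c);
    }
    else
    {
        /* R: change the buffer first; D runs; crash before R writes WAL. */
        c.buffer[FIELD_RELFROZENXID] = 200;
        toy_update_datfrozenxid(&c);
    }
    toy_crash_recover(&c);
    return c.buffer[FIELD_DATFROZENXID] > c.buffer[FIELD_RELFROZENXID];
}
```

In this model, toy_dat_overtakes_rel(false) reports the inversion and
toy_dat_overtakes_rel(true) does not, because D can only observe the new
relfrozenxid after the WAL record for it already exists.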

This resolves the last inplace update defect known to me.

Thanks,
nm

Attachment                                                          Content-Type  Size
inplace180-datfrozenxid-overtakes-relfrozenxid-v1.patch             text/plain    5.5 KB
demo-inplace170-datfrozenxid-overtakes-relfrozenxid-test-v1.patch   text/plain    15.4 KB
demo-inplace180-datfrozenxid-overtakes-relfrozenxid-fix-v1.patch    text/plain    5.4 KB
