Inval reliability, especially for inplace updates

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Inval reliability, especially for inplace updates
Date: 2024-05-23 00:05:48
Message-ID: 20240523000548.58.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> Separable, nontrivial things not fixed in the attached patch stack:
>
> - Inplace update uses transactional CacheInvalidateHeapTuple(). ROLLBACK of
> CREATE INDEX wrongly discards the inval, leading to the relhasindex=t loss
> still seen in inplace-inval.spec. CacheInvalidateRelmap() does this right.

I plan to fix that like CacheInvalidateRelmap(): send the inval immediately,
inside the critical section. Send it in heap_xlog_inplace(), too. The
interesting decision is how to handle RelationCacheInitFilePreInvalidate(),
which has an unlink_initfile() that can fail with e.g. EIO. Options:

1. Unlink during critical section, and accept that EIO becomes PANIC. Replay
may reach the same EIO, and the system won't reopen to connections until
the storage starts cooperating.a Interaction with checkpoints is not ideal.
If we checkpoint and then crash between inplace XLogInsert() and inval,
we'd be relying on StartupXLOG() -> RelationCacheInitFileRemove(). That
uses elevel==LOG, so replay would neglect to PANIC on EIO.

2. Unlink before critical section, so normal xact abort suffices. This would
hold RelCacheInitLock and a buffer content lock at the same time. In
RecordTransactionCommit(), it would hold RelCacheInitLock and e.g. slru
locks at the same time.

The PANIC risk of (1) seems similar to the risk of PANIC at
RecordTransactionCommit() -> XLogFlush(), which hasn't been a problem. The
checkpoint-related risk bothers me more, and (1) generally makes it harder to
reason about checkpoint interactions. The lock order risk of (2) feels
tolerable. I'm leaning toward (2), but that might change. Other preferences?

Another decision is what to do about LogLogicalInvalidations(). Currently,
inplace update invalidations do reach WAL via LogLogicalInvalidations() at the
next CCI. Options:

a. Within logical decoding, cease processing invalidations for inplace
updates. Inplace updates don't affect storage or interpretation of table
rows, so they don't affect logicalrep_write_tuple() outcomes. If they did,
invalidations wouldn't make it work. Decoding has no way to retrieve a
snapshot-appropriate version of the inplace-updated value.

b. Make heap_decode() of XLOG_HEAP_INPLACE recreate the invalidation. This
would be, essentially, cheap insurance against invalidations having a
benefit I missed in (a).

I plan to pick (a).

> - AtEOXact_Inval(true) is outside the RecordTransactionCommit() critical
> section, but it is critical. We must not commit transactional DDL without
> other backends receiving an inval. (When the inplace inval becomes
> nontransactional, it will face the same threat.)

This faces the same RelationCacheInitFilePreInvalidate() decision, and I think
the conclusion should be the same as for inplace update.

Thanks,
nm

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-05-23 00:21:32 Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)
Previous Message Martijn Wallet 2024-05-22 23:29:35 Re: processes stuck in shutdown following OOM/recovery