Re: Issues with 2PC at recovery: CLOG lookups and GlobalTransactionData

From: Noah Misch <noah(at)leadboat(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Subject: Re: Issues with 2PC at recovery: CLOG lookups and GlobalTransactionData
Date: 2025-02-19 00:57:47
Message-ID: 20250219005747.26.nmisch@google.com
Lists: pgsql-hackers

On Thu, Jan 30, 2025 at 03:36:20PM +0900, Michael Paquier wrote:
> And I am beginning a new thread about going through an issue that Noah
> has mentioned at [1], which is that the 2PC code may attempt to do
> CLOG lookups at very early stage of recovery, where the cluster is not
> in a consistent state.

It's broader than CLOG lookups. I wrote in [1], "We must not read the old
pg_twophase file, which may contain an unfinished write." Until recovery
reaches consistency, none of the checks in ProcessTwoPhaseBuffer() or its
callee ReadTwoPhaseFile() are safe.

> [1]: https://www.postgresql.org/message-id/20250117005221.05.nmisch@google.com
> [2]: https://www.postgresql.org/message-id/20250116205254.65.nmisch@google.com

On Fri, Jan 31, 2025 at 09:21:53AM +0900, Michael Paquier wrote:
> --- a/src/test/recovery/t/009_twophase.pl
> +++ b/src/test/recovery/t/009_twophase.pl

> + log_like => [
> + qr/removing stale two-phase state from memory for transaction $commit_prepared_xid of epoch 0/,
> + qr/removing stale two-phase state from memory for transaction $abort_prepared_xid of epoch 0/
> + ]);

> + log_like => [
> + qr/removing past two-phase state file for transaction 4095 of epoch 238/,
> + qr/removing future two-phase state file for transaction 4095 of epoch 511/
> + ]);

As I wrote in [1], "By the time we reach consistency, every file in
pg_twophase will be applicable (not committed or aborted)." If we find
otherwise, the user didn't follow the backup protocol (or there's another
bug). Hence, long-term, we should stop these removals and just fail recovery.
We can't fix all data loss consequences of not following the backup protocol,
so the biggest favor we can do the user is draw their attention to the
problem. How do you see it?
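
To make the two points above concrete, here is a toy sketch of the decision
being discussed. It is not PostgreSQL source: the enum, the function, and its
parameters are hypothetical names invented for illustration, and the real
checks live in ProcessTwoPhaseBuffer() and ReadTwoPhaseFile(). It assumes only
what is stated above: nothing may be judged before consistency, and a
non-applicable file found afterward is either removed (what the quoted log
messages describe) or treated as a fatal recovery error (the long-term
proposal).

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is PostgreSQL source. */
typedef enum
{
	TWOPHASE_DEFER,		/* not consistent yet: do not read or judge the file */
	TWOPHASE_KEEP,		/* file belongs to a still-prepared transaction */
	TWOPHASE_REMOVE,	/* drop the non-applicable file and continue */
	TWOPHASE_FAIL		/* stop recovery so the user notices the problem */
} TwoPhaseAction;

/*
 * Decide what to do with one pg_twophase file.
 *
 * reached_consistency: recovery has reached a consistent state.
 * xact_still_prepared: the file's transaction is neither committed nor
 *                      aborted, i.e. the file is "applicable".
 * fail_on_stale:       true models the long-term proposal (fail recovery),
 *                      false models keeping the removals.
 */
TwoPhaseAction
decide_twophase_action(bool reached_consistency,
					   bool xact_still_prepared,
					   bool fail_on_stale)
{
	/* Before consistency the file may hold an unfinished write. */
	if (!reached_consistency)
		return TWOPHASE_DEFER;

	/* After consistency, every file is expected to be applicable. */
	if (xact_still_prepared)
		return TWOPHASE_KEEP;

	/* Anything else suggests a broken backup protocol or another bug. */
	return fail_on_stale ? TWOPHASE_FAIL : TWOPHASE_REMOVE;
}
```

The only difference between the two policies is the last line; the deferral
before consistency applies to both.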

For back branches, the ideal is less clear. If we can convince ourselves that
enough of these events will indicate damaging problems (user error, hardware
failure, or PostgreSQL bugs), the long-term ideal of failing recovery is also
right for back branches. However, it could be too hard to convince ourselves
of that. If so, that could justify keeping these removals in back branches.
