Re: An improvement of ProcessTwoPhaseBuffer logic

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: An improvement of ProcessTwoPhaseBuffer logic
Date: 2025-01-17 02:04:03
Message-ID: Z4m6k--M1AvAWuRF@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 16, 2025 at 04:52:21PM -0800, Noah Misch wrote:
> In other words, the root problem that led to commits e358425 and 7e125b2 was
> recovery interpreting pg_twophase file content before reaching consistency.
> We can't make the ProcessTwoPhaseBuffer() checks safe before reaching
> consistency. Is that right? Once ProcessTwoPhaseBuffer() stops running
> before consistency, it won't strictly need the range checks. We may as well
> have something like those range checks as a best-effort way to detect base
> backup spec violations.

Oh, yeah, it seems like you have a very good point here.
ProcessTwoPhaseBuffer() is an effort to use a unique path for the
handling of this 2PC data, and we cannot do that safely at this stage.
The problem is restoreTwoPhaseData(), which is called very early in
the recovery process, while all the other code paths calling
ProcessTwoPhaseBuffer() are done when we're doing reading WAL after
we've reached consistency as far as I can see. This is a wrong
concept since 728bd991c3c4. 5a1dfde8334b has made that much harder to
think about, while not integrating fully things. It also looks like
we may be able to just get rid of the CLOG checks entirely if we
overwrite existing files as long as we are not in a consistent state,
as you say. Hmm.

I agree that e358425 and 7e125b2 are weird attempts that just try to
work around the real issue, and that they're making some cases wrong
while on it with potential file removals.

+typedef void (*TwoPhaseCallback) (FullTransactionId fxid, uint16 info,
void *recdata, uint32 len);

Based on your latest patch, I doubt that we'll be able to do any of
that in the back-branches. That's much nicer in the long term to show
what this code relies on.

I've been thinking about the strategy to use here, and here are some
rough steps:
- Let's first revert e358425 and 7e125b2. This indeed needs to be
reworked, and there is a release coming by.
- Your refactoring around xid8funcs.c is a good idea on its own. This
can be an independent patch.
- Let's figure out how much surgery is required with a focus on HEAD
for now (I'm feeling that more non-backpatchable pieces are going to
be needed here), then extract the relevant pieces that could be
backpatchable. Hard to say if this is going to be doable at this
stage, or even worth doing, but it will be possible to dig into the
details once we're happy with the state of HEAD. My first intuition
is that the risk of touching the back-branches may not be worth the
potential reward based on the lack of field complaints (not seen
customer cases, tbh): high risk, low reward.

Another point to consider is if we'd better switch 2PC files to store
a fxid rather than a xid.. Perhaps that's not necessary, but if we're
using FullTransactionIds across the full stack of twophase.c that
could make the whole better with even less xid <-> fxid translations
in all these paths. There is always the counter-argument of the extra
4 bytes in the 2PC files and the records, though.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2025-01-17 02:14:40 Re: Pgoutput not capturing the generated columns
Previous Message Peter Smith 2025-01-17 01:53:06 Re: Pgoutput not capturing the generated columns