Re: An improvement of ProcessTwoPhaseBuffer logic

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: An improvement of ProcessTwoPhaseBuffer logic
Date: 2025-01-17 03:16:15
Message-ID: Z4nLfw6hcdz0BTkP@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 16, 2025 at 06:44:16PM -0800, Noah Misch wrote:
> On Fri, Jan 17, 2025 at 11:04:03AM +0900, Michael Paquier wrote:
>> +typedef void (*TwoPhaseCallback) (FullTransactionId fxid, uint16 info,
>> void *recdata, uint32 len);
>>
>> Based on your latest patch, I doubt that we'll be able to do any of
>> that in the back-branches. That's much nicer in the long term to show
>> what this code relies on.
>
> The twophase.c bits of that patch turned out to be orthogonal to the key
> issue, so that's fine. I suspect the signature changes would be okay to
> back-patch to v17, subject to PGXN confirmation of no extension using those
> APIs. (The TwoPhaseCallback list is hard-coded; extensions can't add
> callbacks.) That said, the rationale for back-patching fxid stuff is gone.

I would not bet on this part.

>> - Let's first revert e358425 and 7e125b2. This indeed needs to be
>> reworked, and there is a release coming by.
>
> +1. e358425 is borderline, but it's hard to rule out ways of it making
> recovery unlink pg_twophase files improperly. Released code may also be doing
> that, but that code's seven years of history suggests it does so infrequently
> if at all. We don't have as much evidence about what the frequency would be
> under e358425.

Thanks. I'll got do that now as I can look at the buildfarm today,
then look at the rest.

>> - Your refactoring around xid8funcs.c is a good idea on its own. This
>> can be an independent patch.
>
> I got here on a yak shave for postgr.es/m/20250111214454.9a.nmisch@google.com.
> I expect that project will still want FullTransactionIdFromAllowableAt(). If
> so, I'll include it in that thread's patch series.

Okay.

> I'm not confident I could fix both that other thread's data loss bug and
> $SUBJECT in time for the 2024-02 releases. (By $SUBJECT, I mean the
> seven-year-old bug of interpreting pg_twophase before recovery consistency,
> which has a higher chance of causing spurious recovery failure starting in
> v17.) Would your or someone else like to fix $SUBJECT, before or after
> 2024-02 releases? I'd make time to review a patch.

Yes, I can work on these two and dig into the pieces, once the state
of the branch a bit cleaner. I'm not sure that a fully-ready patch
will be able to emerge within two weeks as this may need some
interations, but let's see. Backpatching something to remove the clog
lookups does not seem that complicated, actually, now that I'm looking
at it. Simpler in ~16 of course, but it does not seem that bad for
17~ as well if taken the simplest way.

>> Another point to consider is if we'd better switch 2PC files to store
>> a fxid rather than a xid.. Perhaps that's not necessary, but if we're
>> using FullTransactionIds across the full stack of twophase.c that
>> could make the whole better with even less xid <-> fxid translations
>> in all these paths. There is always the counter-argument of the extra
>> 4 bytes in the 2PC files and the records, though.
>
> Yes, perhaps so. It could also be an option to store it only in the
> pg_twophase file, not in the WAL record. Files are a lot rarer.

It would be slightly simpler to do both, I guess. Less translations
between fxids and xids involved this way.

> Similarly, I wondered if pg_twophase files should contain the LSN of the
> XLOG_XACT_PREPARE record, which would make the file's oldness unambiguous in a
> way fxid doesn't achieve. I didn't come up with a problem whose solution
> requires it, though.

Not sure about this one.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message 赵宇鹏 (宇彭) 2025-01-17 03:18:45 Automatic update of time column
Previous Message Nathan Bossart 2025-01-17 03:03:43 Re: convert libpgport's pqsignal() to a void function