Re: An improvement of ProcessTwoPhaseBuffer logic

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: An improvement of ProcessTwoPhaseBuffer logic
Date: 2024-12-30 01:08:31
Message-ID: Z3Hyj3kvkj3b9dFi@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 27, 2024 at 06:16:24PM +0300, Vitaly Davydov wrote:
> In general, I like your solution to use FullTransactionId. I haven't
> found any evident problems. I just would like to propose to create a
> couple of new functions like RemoveTwoPhaseFileInCurrentEpoch,
> TwoPhaseFilePathInCurrentEpoch which accept TransactionId instead
> FullTransactionId. It may make the code clearer and result into less
> boilerplate code. I tried to do some hand testing. It seems, the
> problem is gone with the patch.

Thanks for the review. Agreed that it would be a good thing to limit
the number of paths calling ReadNextFullTransactionId(), but I did not
like much the suggestion TwoPhaseFilePathInCurrentEpoch(), feeling
that it was more important to keep a single code path in charge of
building the file names. Instead, I have gone with a new
FullTransactionIdFromCurrentEpoch() that replaces
AdjustToFullTransactionId(). It cleans up most of the calls to
ReadNextFullTransactionId() compared to the previous patch. It is
true that these couple with calls to ProcessTwoPhaseBuffer(), but the
result felt OK this way in the scope of this fix.

> As an idea, I would like to propose to store FullTransactionId in
> global transaction state instead of TransactionId. I'm not sure, it
> will consume significant additional memory, but it make the code
> more clear and probably result into less number of locks.

FWIW, I was wondering about doing the same thing. However, I have
concluded that this some refactoring work out of the scope of fixing
the primary issue we have here, as we are hit by the way the file
names are built when we attempt to remove them.

Note that the memory footprint of storing a FullTransactionId in
twophase.c's GlobalTransactionData does not worry me much. It is more
important to not increase the size of the two-phase state data,
impacting files on disk and their WAL records. This size argument
has been mentioned on the thread that has added epochs to the 2PC file
names, as far as I recall.

At the end, I have applied two patches, the first one down to 13 that
took care of the "future" issue, with tests added in the v14~v16
range. v13 was lacking a perl routine, and it's not a big deal to not
have coverage for this code path anyway. The second patch has been
applied down to v17, to fix the epoch issue, with the more advanced
tests.

If you have a suggestion of patch to plug in a FullTransactionId into
GlobalTransactionData rather than a TransactionId, feel free to
propose something! Help is always welcome, and this would be
HEAD-only work, making it less urgent to deal with.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-12-30 01:15:19 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message Tom Lane 2024-12-29 22:56:38 Re: IANA timezone abbreviations versus timezone_abbreviations