From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Issues with 2PC at recovery: CLOG lookups and GlobalTransactionData |
Date: | 2025-02-02 23:59:55 |
Message-ID: | Z6AG-ynJPP8ikVKU@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 31, 2025 at 04:54:17PM +0300, Vitaly Davydov wrote:
> I'm looking at the v13 patch. I see, there is the only file for v13:
> v2-0002-Fix-issues-with-2PC-file-handling-at-recovery-13.txt
Yes. The fixes in 13~16 are simpler. There are so many conflicts
across all the branches that I'm planning to maintain branches for
each stable versions, depending on the discussion going on. The
refactoring of 17~ and HEAD could be done first, I suppose.
> #1. In RecoverPreparedTransactions we iterave over all in-memory two phase
> states and check every xid state in CLOG unconditionally. Image, we have
> a very old transaction that is close to the oldestXid. Will CLOG state be
> available for such transaction? I'm not sure about it.
Yes. RecoverPreparedTransactions() where the cluster is officially at
the end of recovery, and we've checked that we are consistent.
> #2. In restoreTwoPhaseData we load all the twostate files that are in
> the valid xid range (from oldestXid to nextFullXid in terms of logical
> comparision of xids with epoch). The question - should we load files
> which xids greater than ControlFile->checkPointCopy.nextXid
> (xids after last checkpoint). In general, all twophase files should belong
> to xids before the last checkpoint. I guess, we should probably ignore
> files which xid is equal or greater than the xid of the last checkpoint -
> twostate data should be in the WAL. If not, I guess, we may see error messages
> like show below when doing xact_redo -> PrepareRedoAdd:
We are already using the nextFullXid from ShmemVariableCache, which is
something that StartupXLOG() fetches from the checkpoint record we
define as the starting point of recovery, which provides a protection
good enough. Grabbing this information from checkPointCopy.nextXid
would be actually worse when recovering from a backup_label, no? The
copy of the checkpoint record in the control file would be easily
newer than the checkpoint record retrieved from the LSN in the
backup_label, leading to less files considered as in the future. I'd
rather trust what we have in WAL a maximum for this data.
Note that there is a minor release next week on the roadmap, and for
clarity I am not planning to rush this stuff in the tree this week:
https://www.postgresql.org/developer/roadmap/
Let's give it more time to find a solution we're all OK with. I'd
love to hear from Noah here as well, as he got involved in the
discussion of the other thread.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2025-02-03 00:28:40 | meson's in-tree libpq header search order vs -Dextra_include_dirs |
Previous Message | Peter Smith | 2025-02-02 22:58:04 | Re: Pgoutput not capturing the generated columns |