Re: Changing the state of data checksums in a running cluster

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Michael Banck <mbanck(at)gmx(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Changing the state of data checksums in a running cluster
Date: 2025-03-13 09:54:28
Message-ID: 5AED03AE-C15F-43DD-93AE-CCD6F3DC6A07@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 12 Mar 2025, at 14:16, Tomas Vondra <tomas(at)vondra(dot)me> wrote:

> I continued investigating this and experimenting with alternative
> approaches, and I think the way the patch relies on ControlFile is not
> quite right. That is, it always sets data_checksum_version to the last
> ("current") value, but that's not what ControlFile is for ...

Agreed, that's a thinko on my part. Reading it makes it clear, but I had
failed to see that when hacking =/

> XLogCtl seemed like a good place, so I used that - after all, it's a
> value from XLOG. Maybe there's a better place? I'm open to suggestions,
> but it does not really affect the overall approach.

Seems like a good place for it.

> So all the places now update XLogCtl->data_checksums_version instead of
> the ControlFile, and also query this flag for *current* value.
>
> The value is copied from XLogCtl to ControlFile when creating checkpoint
> (or restartpoint), and the control file is persisted. This means (a) the
> current value can't get written to the control file prematurely, and (b)
> the value is consistent with the checkpoint (i.e. with the LSN where we
> start crash recovery, if needed).

+1

> The attached 0005 patch implements this. It's a bit WIP and I'm sure it
> can be improved, but I'm yet to see a single crash/failure with it. With
> the original patch I've seen crashes after 5-10 loops (i.e. a couple
> minutes), I'm now at loop 1000 and it's still OK.

Given how successful this test has been at stressing out errors that is indeed
comforting to hear.

> I believe the approach is correct, but the number of possible states
> (e.g. after a crash/restart) seems a bit complex. I wonder if there's a
> better way to handle this, but I can't think of any. Ideas?

Not sure if this moves the needle too far in terms of complexity wrt to the
previous version of the patch, there were already multiple copies.

> One issue I ran into is the postmaster does not seem to be processing
> the barriers, and thus not getting info about the data_checksum_version
> changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

> That's fine until it needs to launch a child process (e.g. a
> walreceiver), which will then see the LocalDataChecksumVersion as of the
> start of the instance, not the "current" one. I fixed this by explicitly
> refreshing the value in postmaster_child_launch(), but maybe I'm missing
> something. (Also, EXEC_BACKEND may need to handle this too.)

The pg_checksums test is failing for me on this version due to the GUC not
being initialized, don't we need something like the below as well? (With a
comment explaining why ReadControlFile wasn't enough.)

@@ -5319,6 +5319,7 @@ LocalProcessControlFile(bool reset)
Assert(reset || ControlFile == NULL);
ControlFile = palloc(sizeof(ControlFileData));
ReadControlFile();
+ SetLocalDataChecksumVersion(ControlFile->data_checksum_version);

A few comments on the patchset:

+ * Local state fror Controlfile data_checksum_version. After initialization
s/fror/for/. Also, this is no longer true as it's a local copy of the XlogCtl
value and not the Controlfile value (which may or may not be equal).

- if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+ if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
ereport(WARNING,
(errmsg("data checksums are being enabled, but no worker is running"),
errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
Reading this made me realize what a terrible error message I had placed there,
the hint is good but the message says checksums are being enabled but they're
not being enabled. Maybe "data checksums are marked as being in-progress, but
no worker is running"

+uint32
+GetLocalDataChecksumVersion(void)
+{
+ return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+ return XLogCtl->data_checksum_version;
+}
I wonder if CachedDataChecksumVersion would be more appropriate to distinguish
it from the Current value, and also to make appear less like actual copies of
controlfile values like LocalMinRecoveryPoint. Another thought is if we should
have the GetLocalDataChecksumVersion() API? GetCurrentDataChecksumVersion()
should be a better API no?

--
Daniel Gustafsson

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2025-03-13 10:01:46 Re: Parallel safety docs for CTEs
Previous Message Amit Kapila 2025-03-13 09:50:01 Re: Add an option to skip loading missing publication to avoid logical replication failure