Re: race condition when writing pg_control

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: race condition when writing pg_control
Date: 2024-05-16 18:35:58
Message-ID: 20240516183558.GA1620903@nathanxps13
Lists: pgsql-hackers

On Thu, May 16, 2024 at 12:19:22PM -0400, Melanie Plageman wrote:
> Today, after committing a3e6c6f, I saw recovery/018_wal_optimize.pl
> fail and see this message in the replica log [2].
>
> 2024-05-16 15:12:22.821 GMT [5440][not initialized] FATAL: incorrect
> checksum in control file
>
> I'm pretty sure it's not related to my commit. So, I was looking for
> existing reports of this error message.

Yeah, I don't see how it could be related.

> It's a long shot, since 0001 and 0002 were already pushed, but this is
> the only recent report I could find of "FATAL: incorrect checksum in
> control file" in pgsql-hackers or bugs archives.
>
> I do see this thread from 2016 [3] which might be relevant because the
> reported bug was also on Windows.

I suspect it will be difficult to investigate this one too much further
unless we can track down a copy of the control file with the bad checksum.
Other than searching for any new code that isn't doing the appropriate
locking, maybe we could search the buildfarm for any other occurrences. I
also see some threads concerning whether the way we are reading/writing
the control file is atomic.
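
To make the atomicity concern concrete, here is a toy sketch (not
PostgreSQL code) of how a non-atomic read could produce a bad checksum even
though neither the old nor the new file contents are corrupt. Everything in
it is an assumption made for illustration: the 64-byte payload, the helper
names encode() and checksum_ok(), and zlib's CRC-32 standing in for the
control file's real CRC. The reader sees the new payload together with the
checksum from the old image, so verification fails.

```python
import struct
import zlib

PAYLOAD_SIZE = 64  # hypothetical size, not the real pg_control layout


def encode(payload: bytes) -> bytes:
    """Return payload followed by its CRC-32 (stand-in for the real CRC)."""
    assert len(payload) == PAYLOAD_SIZE
    return payload + struct.pack("<I", zlib.crc32(payload))


def checksum_ok(image: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored one."""
    payload, stored = image[:PAYLOAD_SIZE], image[PAYLOAD_SIZE:]
    return struct.unpack("<I", stored)[0] == zlib.crc32(payload)


# Two consistent images that differ in a single byte of the payload.
old_image = encode(b"\x01" + b"\x00" * (PAYLOAD_SIZE - 1))
new_image = encode(b"\x02" + b"\x00" * (PAYLOAD_SIZE - 1))

# Torn read: the payload comes from the new image, but the trailing
# checksum bytes still come from the old one.
torn_image = new_image[:PAYLOAD_SIZE] + old_image[PAYLOAD_SIZE:]

print("old ok: ", checksum_ok(old_image))
print("new ok: ", checksum_ok(new_image))
print("torn ok:", checksum_ok(torn_image))
```

If something like this is what happened, a second read taken after the
writer finished would presumably pass, which might be one way to tell a torn
read apart from real corruption.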

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
