Re: race condition when writing pg_control

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: race condition when writing pg_control
Date: 2024-07-15 03:44:48
Message-ID: CA+hUKGJ2Mfp1s+nKk1rxQytB0p2OVxUUiYTpi4S3a7UX862K5Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 12, 2024 at 11:43 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Sat, May 18, 2024 at 05:29:12PM +1200, Thomas Munro wrote:
> > On Fri, May 17, 2024 at 4:46 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > > The specific problem here is that LocalProcessControlFile() runs in
> > > every launched child for EXEC_BACKEND builds. Windows uses
> > > EXEC_BACKEND, and Windows' NTFS file system is one of the two file
> > > systems known to this list to have the concurrent read/write data
> > > mashing problem (the other being ext4).
>
> > First idea I've come up with to avoid all of that: pass a copy of
> > the "proto-controlfile", to coin a term for the one read early in
> > postmaster startup by LocalProcessControlFile(). As far as I know,
> > the only reason we need it is to suck some settings out of it that
> > don't change while a cluster is running (mostly can't change after
> > initdb, and checksums can only be {en,dis}abled while down). Right?
> > Children can just "import" that sucker instead of calling
> > LocalProcessControlFile() to figure out the size of WAL segments yada
> > yada, I think? Later they will attach to the real one in shared
> > memory for all future purposes, once normal interlocking is allowed.
>
> I like that strategy, particularly because it recreates what !EXEC_BACKEND
> backends inherit from the postmaster. It might prevent future bugs that would
> have been specific to EXEC_BACKEND.

Thanks for looking! Yeah, that is a good way to put it.

The only other idea I can think of is that the postmaster could take
all of the things that LocalProcessControlFile() wants to extract from
the file, and transfer them via that struct used for EXEC_BACKEND as
individual variables, instead of this new proto-controlfile copy. I
think it would be a bigger change with no obvious-to-me additional
benefit, so I didn't try it.
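
To make the proto-controlfile idea concrete, here is a minimal sketch of
the shape I have in mind. It is not the attached patch: the struct names,
field names and functions below are all hypothetical stand-ins, and the
real ControlFileData has many more members. The point is only that the
postmaster snapshots what it read at startup, and an EXEC_BACKEND child
copies that snapshot instead of re-reading pg_control from disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, cut-down stand-in for the control file contents: only
 * settings that do not change while a cluster is running.
 */
typedef struct ProtoControlFile
{
    uint32_t    xlog_seg_size;          /* WAL segment size in bytes */
    uint32_t    data_checksum_version;  /* 0 means checksums disabled */
} ProtoControlFile;

/*
 * Hypothetical stand-in for the struct the postmaster hands to each
 * launched child in EXEC_BACKEND builds.
 */
typedef struct ChildParams
{
    ProtoControlFile proto_control;
    bool        has_proto_control;
} ChildParams;

/* Process-local copy, used until the real one in shared memory is attached. */
static ProtoControlFile LocalProtoControl;
static bool LocalProtoControlValid = false;

/* Postmaster side: snapshot the copy that was read once, early in startup. */
void
export_proto_control(ChildParams *params, const ProtoControlFile *read_at_startup)
{
    assert(params != NULL && read_at_startup != NULL);
    params->proto_control = *read_at_startup;
    params->has_proto_control = true;
}

/*
 * Child side: "import" the snapshot rather than reading pg_control again,
 * so no read can overlap with a concurrent write of the file.
 */
bool
import_proto_control(const ChildParams *params)
{
    assert(params != NULL);
    if (!params->has_proto_control)
        return false;
    memcpy(&LocalProtoControl, &params->proto_control, sizeof(LocalProtoControl));
    LocalProtoControlValid = true;
    return true;
}

/* Example consumer: a setting needed before shared memory is available. */
uint32_t
local_wal_segment_size(void)
{
    assert(LocalProtoControlValid);
    return LocalProtoControl.xlog_seg_size;
}
```

Copying the whole struct means a newly needed setting comes along for
free. The individual-variables alternative would need a new member in the
parameter struct for each one.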

> > I dunno. Draft patch attached. Better plans welcome. This passes CI
> > on Linux systems afflicted by EXEC_BACKEND, and Windows. Thoughts?
>
> Looks reasonable. I didn't check over every detail, given the draft status.

I'm going to upgrade this to a proposal:

https://commitfest.postgresql.org/49/5124/

I wonder how often this happens in the wild.
