Re: race condition when writing pg_control

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: race condition when writing pg_control
Date: 2024-05-16 19:07:44
Message-ID: 20240516190744.6beu43cjdvrml6uw@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-05-16 15:01:31 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2024-05-16 14:50:50 -0400, Tom Lane wrote:
> >> The intention was certainly always that it be atomic. If it isn't
> >> we have got *big* trouble.
>
> > We unfortunately do *know* that on several systems e.g. basebackup can read a
> > partially written control file, while the control file is being
> > updated.
>
> Yeah, but can't we just retry that if we get a bad checksum?

Retry what/where precisely? We can avoid the issue in basebackup.c by taking
ControlFileLock in the right moment - but that doesn't address
pg_start/stop_backup based backups. Hence the patch in the referenced thread
moving to replacing the control file by atomic-rename if there are base
backups ongoing.

> What had better be atomic is the write to disk.

That is still true to my knowledge.

> Systems that can't manage POSIX semantics for concurrent reads and writes
> are annoying, but not fatal ...

I think part of the issue that people don't agree what posix says about
a read that's concurrent to a write... See e.g.
https://utcc.utoronto.ca/~cks/space/blog/unix/WriteNotVeryAtomic

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2024-05-16 19:11:14 Re: commitfest.postgresql.org is no longer fit for purpose
Previous Message Tom Lane 2024-05-16 19:01:31 Re: race condition when writing pg_control