From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: avoid multiple hard links to same WAL file after a crash |
Date: | 2022-05-02 23:06:13 |
Message-ID: | 20220502230613.GA3398932@nathanxps13 |
Lists: | pgsql-hackers |
On Mon, May 02, 2022 at 10:39:07AM -0700, Nathan Bossart wrote:
> On Mon, May 02, 2022 at 07:48:18PM +0900, Michael Paquier wrote:
>> The WAL receiver upgrades the ERROR to a FATAL, and restarts
>> streaming shortly after. Using durable_rename() would not be an issue
>> here.
>
> Thanks for investigating this one. I think I agree that we should simply
> switch to durable_rename() (without a file existence check beforehand).
Here is a new patch set. For now, I've only removed the file existence
check in writeTimeLineHistoryFile(). I'm not fully convinced that this is
problem-free (e.g., due to concurrent .ready file creation), but since some
platforms have been using rename() for some time, I'm not sure how worried
we should be. I thought about adding some kind of
locking between the WAL receiver and startup processes, but that seems
excessive. Alternatively, we could just fix xlog.c as proposed earlier
[0]. AFAICT that is the only caller that can experience problems due to
the multiple-hard-link issue. All other callers are simply renaming a
temporary file into place, and the temporary file can be discarded if left
behind after a crash.
[0] https://postgr.es/m/20220407182954.GA1231544%40nathanxps13
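
As an illustration of the multiple-hard-link issue discussed above, here is a minimal sketch (not the PostgreSQL implementation). It assumes the problem arises from a rename built as link() followed by unlink(): a crash between the two calls leaves both names pointing at the same file, whereas a single rename() never exposes that state. The helper names (link_then_unlink, rename_into_place, demo) and the file names are hypothetical, and the fsync calls a durable rename would need are omitted.

```python
import os
import tempfile


def link_count(path):
    """Return the number of hard links pointing at path's inode."""
    return os.stat(path).st_nlink


def link_then_unlink(src, dst, crash_before_unlink=False):
    """Hypothetical stand-in for a link()+unlink() style rename.

    It refuses to overwrite dst, but it is not atomic: a crash between
    the two calls leaves src and dst as hard links to the same file.
    """
    os.link(src, dst)  # raises FileExistsError if dst already exists
    if crash_before_unlink:
        return  # simulate a crash between the two steps
    os.unlink(src)


def rename_into_place(src, dst):
    """Single rename(): src disappears and dst appears in one step."""
    os.rename(src, dst)  # on POSIX this replaces an existing dst


def demo():
    results = {}
    with tempfile.TemporaryDirectory() as d:
        # Case 1: simulated crash in the middle of link()+unlink().
        old_name = os.path.join(d, "old_segment")
        new_name = os.path.join(d, "new_segment")
        with open(old_name, "w") as f:
            f.write("wal")
        link_then_unlink(old_name, new_name, crash_before_unlink=True)
        results["both_names_exist"] = (
            os.path.exists(old_name) and os.path.exists(new_name)
        )
        results["same_file"] = os.path.samefile(old_name, new_name)
        results["links_after_crash"] = link_count(new_name)

        # Case 2: renaming a temporary file into place.
        tmp_name = os.path.join(d, "history.tmp")
        final_name = os.path.join(d, "history")
        with open(tmp_name, "w") as f:
            f.write("tli")
        rename_into_place(tmp_name, final_name)
        results["tmp_gone"] = not os.path.exists(tmp_name)
        results["links_after_rename"] = link_count(final_name)
    return results


if __name__ == "__main__":
    print(demo())
```

In the first case both names survive and refer to one file, so later writes through either name affect the other. In the second case only the final name remains, which is why a leftover temporary file can simply be discarded after a crash.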
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Replace-calls-to-durable_rename_excl-with-durable.patch | text/x-diff | 4.9 KB |
v5-0002-Remove-durable_rename_excl.patch | text/x-diff | 4.0 KB |