From: | Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: InstallXLogFileSegment() vs concurrent WAL flush |
Date: | 2024-02-02 11:56:08 |
Message-ID: | 20240202205608.7e8a6a9ed08386d218d73704@sraoss.co.jp |
Lists: | pgsql-hackers |
On Fri, 2 Feb 2024 11:18:18 +0100
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Hi,
>
> New WAL space is created by renaming a file into place. Either a
> newly created file with a temporary name or, ideally, a recyclable old
> file with a name derived from an old LSN. I think there is a data
> loss window between rename() and fsync(parent_directory). A
> concurrent backend might open(new_name), write(), fdatasync(), and
> then we might lose power before the rename hits the disk. The data
> itself would survive the crash, but recovery wouldn't be able to find
> and replay it. That might break the log-before-data rule or forget a
> transaction that has been reported as committed to a client.
>
> Actual breakage would presumably require really bad luck, and I
> haven't seen this happen or anything, it just occurred to me while
> reading code, and I can't see any existing defences.
>
> One simple way to address that would be to make XLogFileInitInternal()
> wait for InstallXLogFileSegment() to finish. It's a little
Or could we make sure the rename is durable by calling fsync before
returning the fd, as in the attached patch?
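
To illustrate the general idea (this is not the attached patch, only a
minimal standalone sketch), the ordering would be: rename the file into
place, fsync the parent directory so the new directory entry is on disk,
and only then hand out an fd for the new name. The function names
fsync_dir_sketch() and install_and_open_sketch() are hypothetical and do
not exist in PostgreSQL; the sketch also assumes a platform where fsync()
on a directory fd opened with O_RDONLY is supported.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush a directory so that entries recently
 * renamed into it survive a crash.  Assumes fsync() works on a
 * directory fd opened O_RDONLY (true on Linux, not on every platform).
 */
static int
fsync_dir_sketch(const char *dirpath)
{
    int fd = open(dirpath, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    if (close(fd) != 0 && rc == 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical helper: rename tmppath to newpath, make the rename
 * durable, and only then return an fd for newpath.  Returns -1 on
 * any failure.
 */
static int
install_and_open_sketch(const char *tmppath, const char *newpath,
                        const char *dirpath)
{
    assert(tmppath != NULL && newpath != NULL && dirpath != NULL);

    if (rename(tmppath, newpath) != 0)
        return -1;

    /* Without this step, a crash could lose the new directory entry. */
    if (fsync_dir_sketch(dirpath) != 0)
        return -1;

    /* Data written and synced through this fd is now findable after a crash. */
    return open(newpath, O_RDWR);
}
```

With this ordering, no caller can write and fdatasync data into a file
whose name is not yet durable, at the cost of one directory fsync per
installed segment.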
Regards,
Yugo Nagata
> pessimistic to do that unconditionally, though, as then you have to
> wait even for rename operations for segment files later than the one
> you're opening, so I thought about how to skip waiting in that case --
> see 0002. I'm not sure if it's worth worrying about or not.
--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Attachment | Content-Type | Size |
---|---|---|
fix_InstallXLogFileSegment_in_another_way.patch | text/x-diff | 535 bytes |