Quick Links

Re: Checkpoint not retrying failed fsync?

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Checkpoint not retrying failed fsync?
Date:	2018-04-06 00:56:54
Message-ID:	CAEepm=1bVBFXsWkM5w74z6DdM3rdbTYhsVk6P+9fgZwVoyp=Fg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Apr 6, 2018 at 11:36 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Fri, Apr 6, 2018 at 11:34 AM, Andrew Gierth
> <andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
>> Right.
>>
>> But I don't think just copying the value is sufficient; if a new bit was
>> set while we were processing the old ones, how would we know which to
>> clear? We couldn't just clear all the bits afterwards because then we
>> might lose a request.
>
> Agreed. The attached draft patch handles that correctly, I think.

After some testing, here is a better one for review. Changes:

1. The copy wasn't really needed.
2. errno needed to be restored (in case bms_union stomped on it),
thanks to Andrew for an off-list complaint about that.
3. Memory context was wrong.
4. bms_first_member() eats its input. Need to use bms_next_member() instead.
5. Mustn't leak in error path (that was a pre-existing bug).

Also here's a patch to test it. If the file /tmp/broken_fsync exists,
you'll see a fake EIO when you CHECKPOINT.

--
Thomas Munro
http://www.enterprisedb.com

Attachment	Content-Type	Size
0001-Make-sure-we-don-t-forget-about-fsync-requests-after.patch	application/octet-stream	2.5 KB
0002-Break-fsyncs-for-testing-not-for-commit.patch	application/octet-stream	911 bytes

In response to

Re: Checkpoint not retrying failed fsync? at 2018-04-05 23:36:39 from Thomas Munro

Responses

Re: Checkpoint not retrying failed fsync? at 2018-04-06 01:11:36 from Thomas Munro
Re: Checkpoint not retrying failed fsync? at 2018-04-08 05:36:00 from Amit Kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Thomas Munro	2018-04-06 01:11:36	Re: Checkpoint not retrying failed fsync?
Previous Message	Daniel Gustafsson	2018-04-06 00:28:17	Re: Online enabling of checksums