Re: fsync-pgdata-on-recovery tries to write to more files than previously

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Christoph Berg <myon(at)debian(dot)org>
Subject: Re: fsync-pgdata-on-recovery tries to write to more files than previously
Date: 2015-05-26 15:43:32
Message-ID: 556494A4.9020002@dunslane.net
Lists: pgsql-hackers


On 05/26/2015 08:05 AM, Robert Haas wrote:
> On Mon, May 25, 2015 at 9:54 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> I certainly see your point, but Tom also pointed out that it's not great
>> to ignore failures during this phase:
>>
>> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>>> Greg Stark <stark(at)mit(dot)edu> writes:
>>>> What exactly is failing?
>>>> Is it that fsync is returning -1 ?
>>> According to the original report from Christoph Berg, it was open()
>>> not fsync() that was failing, at least in permissions-based cases.
>>>
>>> I'm not sure if we should just uniformly ignore all failures in this
>>> phase. That would have the merit of clearly not creating any new
>>> startup failure cases compared to the previous code, but as you say
>>> sometimes it might mean ignoring real problems.
>> If we accept this, then we still have to have the lists, to decide what
>> to fail on and what to ignore. If we're going to have said lists tho, I
>> don't really see the point in fsync'ing things we're pretty confident
>> aren't ours.
> No, that's not right at all. The idea, at least as I understand it,
> would be to decide which errors to ignore by looking at the error code,
> not by looking at which file is involved.
>

OK, I'm late to the party. But why exactly are we syncing absolutely
everything? That seems over-broad.

And might it be better to check that we can open each file using
access(), rather than calling open() and looking at the error code?
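
To make the two options concrete, here is a rough sketch of each. The
names (SyncResult, errno_is_ignorable, sync_by_open, sync_by_access) are
made up for illustration and are not taken from the PostgreSQL source,
and treating only EACCES as ignorable is an assumption, not a settled
policy.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical outcome of trying to sync one file. */
typedef enum { SYNC_OK, SYNC_SKIPPED, SYNC_FAILED } SyncResult;

/*
 * Hypothetical policy: which errno values count as "not our problem".
 * Only EACCES is listed here; the real list is an open question.
 */
static bool
errno_is_ignorable(int err)
{
    return err == EACCES;
}

/* Option 1: call open() and decide what to do from the error code. */
static SyncResult
sync_by_open(const char *path)
{
    int fd;
    int rc;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return errno_is_ignorable(errno) ? SYNC_SKIPPED : SYNC_FAILED;

    rc = fsync(fd);
    close(fd);
    return (rc == 0) ? SYNC_OK : SYNC_FAILED;
}

/* Option 2: probe with access() first, then open and sync. */
static SyncResult
sync_by_access(const char *path)
{
    if (access(path, R_OK | W_OK) != 0)
        return errno_is_ignorable(errno) ? SYNC_SKIPPED : SYNC_FAILED;

    /* The file can change between access() and open(), so the open()
     * error path in sync_by_open() is still needed. */
    return sync_by_open(path);
}
```

In both variants the decision is driven by errno rather than by the file
name. The access() variant adds a system call per file and still needs
the open() error handling, since permissions can change between the two
calls.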

cheers

andrew
