From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Christoph Berg <myon(at)debian(dot)org> |
Subject: | Re: fsync-pgdata-on-recovery tries to write to more files than previously |
Date: | 2015-05-26 15:43:32 |
Message-ID: | 556494A4.9020002@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 05/26/2015 08:05 AM, Robert Haas wrote:
> On Mon, May 25, 2015 at 9:54 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> I certainly see your point, but Tom also pointed out that it's not great
>> to ignore failures during this phase:
>>
>> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>>> Greg Stark <stark(at)mit(dot)edu> writes:
>>>> What exactly is failing?
>>>> Is it that fsync is returning -1 ?
>>> According to the original report from Christoph Berg, it was open()
>>> not fsync() that was failing, at least in permissions-based cases.
>>>
>>> I'm not sure if we should just uniformly ignore all failures in this
>>> phase. That would have the merit of clearly not creating any new
>>> startup failure cases compared to the previous code, but as you say
>>> sometimes it might mean ignoring real problems.
>> If we accept this, then we still have to have the lists, to decide what
>> to fail on and what to ignore. If we're going to have said lists tho, I
>> don't really see the point in fsync'ing things we're pretty confident
>> aren't ours.
> No, that's not right at all. The idea, at least as I understand it,
> would be decide which errors to ignore by looking at the error code,
> not by looking at which file is involved.
>
OK, I'm late to the party. But why exactly are we syncing absolutely
everything? That seems over-broad.
And might it be better to check that we can open each file using
access() than calling open() and looking at the error code?
cheers
andrew
From | Date | Subject | |
---|---|---|---|
Next Message | Paul Smith | 2015-05-26 15:47:15 | Re: ERROR: MultiXactId xxxx has not been created yet -- apparent wraparound |
Previous Message | Alvaro Herrera | 2015-05-26 15:04:01 | Re: "unaddressable bytes" in BRIN |