Re: Enable data checksums by default

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christoph Berg <myon(at)debian(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enable data checksums by default
Date: 2019-03-22 17:01:32
Message-ID: 12110cc1-3729-9e5e-b6bb-62151a68af29@2ndquadrant.com
Lists: pgsql-hackers

On 3/22/19 5:41 PM, Andres Freund wrote:
> Hi,
>
> On 2019-03-22 17:32:10 +0100, Tomas Vondra wrote:
>> On 3/22/19 5:10 PM, Andres Freund wrote:
>>> IDK, being able to verify in some form that backups aren't corrupted on
>>> an I/O level is mighty nice. That often allows detecting the issue
>>> while one still has older backups around.
>>>
>>
>> Yeah, I agree that's a valuable capability. I think the question is how
>> effective it actually is considering how much the storage changed over
>> the past few years (which necessarily affects the type of failures
>> people have to deal with).
>
> I'm not sure I understand? How do the changes around storage
> meaningfully affect the need to have some trust in backups and
> benefiting from earlier detection?
>

Having trust in backups is still desirable - nothing changes that,
obviously. The question I was posing was rather "are checksums still
effective on current storage systems?"

I'm wondering if the storage systems people use nowadays may be failing
in ways that are not reliably detectable by checksums. I don't have any
data to either support or reject that hypothesis, though.

>
>> It's not clear to me what checksums can do about zeroed pages (and/or
>> truncated files) though.
>
> Well, there's nothing fundamental about needing added pages to be
> zeroes. We could expand them to be initialized with actual valid
> checksums instead of
>     /* new buffers are zero-filled */
>     MemSet((char *) bufBlock, 0, BLCKSZ);
>     /* don't set checksum for all-zero page */
>     smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);
>
> the problem is that it's hard to do so safely without adding a lot of
> additional WAL logging. A lot of filesystems will journal metadata
> changes (like the size of the file), but not contents. So after a crash
> the tail end might appear zeroed out, even if we never wrote
> zeroes. That's obviously solvable by WAL logging, but that's not cheap.
>

Hmmm. I'd say a filesystem that does not guarantee having all the data
after an fsync is outright broken, but maybe that's what checksums are
meant to protect against.

> It might still be a good idea to just write a page with an initialized
> header / checksum at that point, as that ought to still detect a number
> of problems we can't detect right now.
>

Sounds reasonable.

cheers

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
