From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Greg Stark <stark(at)mit(dot)edu>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Enabling Checksums |
Date: | 2013-04-16 19:45:41 |
Message-ID: | CA+U5nMKnQFgkgNFbOBAe58PiPD_HnEc5GpGv9KZJqRE3n94Dpg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 16 April 2013 20:27, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Stark <stark(at)mit(dot)edu> writes:
>> On Fri, Apr 12, 2013 at 9:42 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> * WAL checksum is not used as the sole basis for end-of-WAL discovery.
>>> We reuse the WAL files, so the prev field in each WAL record shows
>>> what the previous end of WAL was. Hence if the WAL checksums give a
>>> false positive we still have a double check that the data really is
>>> wrong. It's unbelievable that you'd get a false positive and then have
>>> the prev field match as well, even though it was the genuine
>>> end-of-WAL.
>
>> This is kind of true and kind of not true. If a system loses power
>> while writing lots of data to WAL then the blocks at the end of the
>> WAL might not be written out in order. Everything since the last log
>> sync might be partly written out and partly not written out. That's
>> the case where the checksum is critical. The beginning of a record
>> could easily be written out including xl_prev and the end of the
>> record not written. 1/64,000 power losses would then end up with an
>> assertion failure or corrupt database.
>
> I have a hard time believing that it's a good idea to add checksums to
> data pages and at the same time weaken our ability to detect WAL
> corruption. So this seems to me to be going in the wrong direction.
> What's it buying for us anyway? A few CPU cycles saved during WAL
> generation? That's probably swamped by the other costs of writing WAL,
> especially if you're using replication.
This part of the thread is dead now .... I said "lets drop this idea"
on 13 April.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Florian Pflug | 2013-04-16 20:20:26 | Re: Enabling Checksums |
Previous Message | Tom Lane | 2013-04-16 19:27:46 | Re: Enabling Checksums |