From: | Greg Stark <stark(at)mit(dot)edu> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Enabling Checksums |
Date: | 2013-04-16 17:45:39 |
Message-ID: | CAM-w4HMkUBaGo1jQCAYJyRFnOju1yioz6Z7QrpSTawvk7EiapQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Apr 12, 2013 at 9:42 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> * WAL checksum is not used as the sole basis for end-of-WAL discovery.
> We reuse the WAL files, so the prev field in each WAL record shows
> what the previous end of WAL was. Hence if the WAL checksums give a
> false positive we still have a double check that the data really is
> wrong. It's unbelievable that you'd get a false positive and then have
> the prev field match as well, even though it was the genuine
> end-of-WAL.
This is kind of true and kind of not true. If a system loses power
while writing lots of data to WAL then the blocks at the end of the
WAL might not be written out in order. Everything since the last log
sync might be partly written out and partly not written out. That's
the case where the checksum is critical. The beginning of a record
could easily be written out, including xl_prev, while the end of the
record is not. Roughly 1 in 64,000 such power losses would then end up
with an assertion failure or a corrupt database.
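
To make the torn-write case concrete, here is a rough sketch, not
PostgreSQL's actual record format. The header layout, the helper names
(make_record, check_record, torn_write) and the use of zlib.crc32 are all
assumptions made for illustration. The header, including xl_prev, reaches
disk, but the tail of the payload is still stale bytes from the recycled
segment, so the xl_prev check passes and only the checksum notices.

```python
import struct
import zlib
from typing import Tuple

# Hypothetical record layout (NOT the real XLogRecord):
#   xl_prev (8 bytes) | payload length (4 bytes) | CRC-32 (4 bytes) | payload
HEADER = struct.Struct("<QII")


def _crc(xl_prev: int, payload: bytes) -> int:
    # zlib.crc32 stands in for the WAL CRC; the real WAL CRC differs.
    return zlib.crc32(struct.pack("<QI", xl_prev, len(payload)) + payload)


def make_record(xl_prev: int, payload: bytes) -> bytes:
    """Build a record whose CRC covers xl_prev, the length and the payload."""
    return HEADER.pack(xl_prev, len(payload), _crc(xl_prev, payload)) + payload


def check_record(buf: bytes, expected_prev: int) -> Tuple[bool, bool]:
    """Return (xl_prev matches, CRC matches) for the record at the start of buf."""
    xl_prev, length, crc = HEADER.unpack_from(buf)
    payload = buf[HEADER.size:HEADER.size + length]
    return xl_prev == expected_prev, _crc(xl_prev, payload) == crc


def torn_write(old: bytes, new: bytes, bytes_written: int) -> bytes:
    """Only the first bytes_written bytes of new reached disk; the rest is
    whatever the recycled segment already held at that position."""
    return new[:bytes_written] + old[bytes_written:len(new)]


if __name__ == "__main__":
    new = make_record(0x1000, b"new-dataTAIL")
    old = b"\xAA" * len(new)  # stale contents of the reused WAL file
    # Header and first 8 payload bytes were written, the last 4 were not.
    torn = torn_write(old, new, HEADER.size + 8)
    print(check_record(new, 0x1000))   # intact record
    print(check_record(torn, 0x1000))  # xl_prev still matches, CRC does not
```

With an n-bit check, stale bytes pass by chance about once in 2**n tries.
The 1/64,000 figure above is what you would get if n were 16
(2**16 = 65,536); that reading is my assumption.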
--
greg