From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, david(at)fetter(dot)org, aidan(at)highrise(dot)ca, stark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: 16-bit page checksums for 9.2 |
Date: | 2012-02-08 03:24:05 |
Message-ID: | 20120208032405.GA5509@tornado.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Feb 07, 2012 at 08:58:59PM +0000, Simon Riggs wrote:
> On Thu, Jan 26, 2012 at 8:20 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Wed, Jan 11, 2012 at 10:12:31PM +0000, Simon Riggs wrote:
> > This patch uses FPIs to guard against torn hint writes, even when the
> > checksums are disabled. ?One could not simply condition them on the
> > page_checksums setting, though. ?Suppose page_checksums=off and we're hinting
> > a page previously written with page_checksums=on. ?If that write tears,
> > leaving the checksum intact, that block will now fail verification. ?A couple
> > of ideas come to mind. ?(a) Read the on-disk page and emit an FPI only if the
> > old page had a checksum. ?(b) Add a checksumEnableLSN field to pg_control.
> > Whenever the cluster starts with checksums disabled, set the field to
> > InvalidXLogRecPtr. ?Whenever the cluster starts with checksums enabled and the
> > field is InvalidXLogRecPtr, set the field to the next LSN. ?When a checksum
> > failure occurs in a page with LSN older than the stored one, emit either a
> > softer warning or no message at all.
>
> We can only change page_checksums at restart (now) so the above seems moot.
>
> If we crash with FPWs enabled we repair any torn pages.
There's no live bug, but that comes at a high cost: the patch has us emit
full-page images for hint bit writes regardless of the page_checksums setting.
> > PageSetLSN() is not atomic, so the shared buffer content lock we'll be holding
> > is insufficient.
>
> Am serialising this by only writing PageLSN while holding buf hdr lock.
That means also taking the buffer header spinlock in every PageGetLSN() caller
holding only a shared buffer content lock. Do you think that will pay off,
versus the settled pattern of trading here your shared buffer content lock for
an exclusive one?
> > I can see value in an option to exclude local buffers, since corruption there
> > may be less exciting. ?It doesn't seem important for an initial patch, though.
>
> I'm continuing to exclude local buffers. Let me know if that should change.
Seems reasonable.
Thanks,
nm
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2012-02-08 03:38:13 | Re: Vacuum rate limit in KBps |
Previous Message | Joachim Wieland | 2012-02-08 03:21:04 | Re: patch for parallel pg_dump |