From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: WAL consistency check facility |
Date: | 2017-02-10 01:17:06 |
Message-ID: | CAB7nPqRzCQb=vdfHvMtP0HMLBHU6z1aGdo4GJsUP-HP8jx+Pkw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Feb 9, 2017 at 5:56 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Feb 8, 2017 at 1:25 AM, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>> Thank you Robert for the review. Please find the updated patch in the
>> attachment.
>
> I have committed this patch after fairly extensive revisions:
Cool. I had finally a look at what has been committed in a507b869.
Running regression tests with all RMGRs enabled, a single installcheck
generates 7GB of WAL. Woah.
> * Rewrote the documentation to give some idea what the underlying
> mechanism of operation of the feature is, so that users who choose to
> enable this will hopefully have some understanding of what they've
> turned on.
Thanks, those look good to me.
> * Renamed "char *page" arguments to "char *pagedata" and "Page page",
> because tempPage doesn't seem to be to be any better a name than
> page_norm.
> * Removed assertion in checkXLogConsistency that consistency checking
> is enabled for the RM; that's trivially false if
> wal_consistency_checking is not the same on the master and the
> standby. (Note that quite apart from the issue of whether this
> function should exist at all, adding it to a header file after the
> closing #endif guard is certainly not right.)
I recall doing those two things the same way as in the commit. Not
sure at which point they have been re-introduced.
> * Changed checkXLogConsistency to skip pages whose LSN is newer than
> that of the record. Without this, if you shut down recovery and
> restart it, it complains of inconsistent pages and dies. (I'm not
> sure this is the only scenario that needs to be covered; it would be
> smart to do more testing of restarting the standby.)
Good point.
> -- a WAL consistency failure causes your standby to die a hard death.
> (Maybe there should be a way to suppress consistency checking on the
> standby -- but I think not just by requiring wal_consistency_checking
> on both ends. Or maybe we should just downgrade the FATAL to WARNING;
> blowing up the standby irrevocably seems like poor behavior.)
Having a FATAL is useful for buildfarm members, that would show up in
red. Having a switch to generate a warning would be useful for live
deployments I agree. Now I think that we need as well two things:
- A recovery test to run regression tests with a standby behind.
- Extend the TAP tests so as it is possible to fill in postgresql.conf
with custom variables.
- have the buildfarm client run recovery tests!
I am fine to write those patches.
> I also bumped XLOG_PAGE_MAGIC (which is normally done by the
> committer, not the patch author, so I wasn't expecting that to be in
> the patch as submitted).
Here are a couple of things I have noticed while looking at the code.
+ * Portions Copyright (c) 2016, PostgreSQL Global Development Group
s/2016/2017/ in bufmask.c and bufmask.h.
+ if (ItemIdIsNormal(iid))
+ {
+
+ HeapTupleHeader page_htup = (HeapTupleHeader) page_item;
Unnecessary newline here.
+ * Read the contents from the backup copy, stored in WAL record and
+ * store it in a temporary page. There is not need to allocate a new
+ * page here, a local buffer is fine to hold its contents and a mask
+ * can be directly applied on it.
s/not need/no need/.
In checkXLogConsistency(), FPWs that have the flag BKPIMAGE_APPLY set
will still be checked, resulting in a FPW being compared to itself. I
think that those had better be bypassed.
Please find attached a patch with those fixes.
--
Michael
Attachment | Content-Type | Size |
---|---|---|
consistency-checks-fix.patch | application/octet-stream | 2.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2017-02-10 01:19:56 | Re: [PATCH] Rename pg_switch_xlog to pg_switch_wal |
Previous Message | Stephen Frost | 2017-02-10 01:16:47 | Re: [PATCH] Rename pg_switch_xlog to pg_switch_wal |