Re: Problem while setting the fpw with SIGHUP

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: michael(at)paquier(dot)xyz, robertmhaas(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, dilipbalaut(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem while setting the fpw with SIGHUP
Date: 2018-04-13 08:34:31
Message-ID: 20180413.173431.121222756.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Sorry, the patch attached to the previous mail is slightly
old. The attached is the correct one.

# They differ only in some phrase in a comment.

====
At Fri, 13 Apr 2018 17:28:40 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20180413(dot)172840(dot)228724367(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
At Fri, 13 Apr 2018 13:47:51 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20180413(dot)134751(dot)76149471(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> At Fri, 13 Apr 2018 08:31:02 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in <CAA4eK1LVFpLf=d-7XmfwhLv7Xu53pU0bGU=wVrYWSRU4XSsyHQ(at)mail(dot)gmail(dot)com>
> > On Fri, Apr 13, 2018 at 6:59 AM, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > > On Thu, Apr 12, 2018 at 02:55:53PM -0400, Robert Haas wrote:
> > >> I think it may actually be confusing. If you run pg_ctl reload and it
> > >> reports that the value has changed, you'll expect it to have taken
> > >> effect. But really, it will take effect at some later time.
> > >
> >
> > +1. I also think it is confusing and it could be difficult for end
> > users to know when the setting is effective.
> >
> > > It is true that sometimes some people like to temporarily disable
> > > full_page_writes particularly when doing some bulk load of data to
> > > minimize the effort on WAL, and then re-enable it just after
> > > inserting this data.
> > >
> > > Still does it matter when the change is effective? By disabling
> > > full_page_writes even temporarily, you accept the fact that this
> > > instance would not be safe until the next checkpoint completes. The
> > > instance even finishes by writing less unnecessary WAL data if the
> > > change is only effective at the next checkpoint. Well, it is true that
> > > this increases the potential for torn-page problems, but the user is
> > > already accepting that risk: if a crash happens before the next
> > > checkpoint, the instance is exposed to torn pages anyway, as the user
> > > chose to disable full_page_writes.
>
> I still don't think that enabling FPW at any time is useful, but
> disabling seems useful, as I mentioned upthread.
>
> The problem was that the checkpointer changes the flag at any time,
> including during recovery. The startup process updates the same flag
> at the end of recovery, but before it is published. Letting the
> checkpointer change the flag only at checkpoint time is a
> straightforward way to avoid conflicts with the startup process.
>
> I reconsidered a bit and came up with the idea that we could
> just skip changing shared FPW in the checkpointer until recovery
> ends, then update the flag after recovery ends (perhaps at
> checkpoint time in most cases). In this case, FPIs are attached
> from the REDO point of the first checkpoint (not restartpoint) or a
> bit earlier, and FPW can then be flipped at any time. I'll come up
> with that soon.

Please find the attached. The most significant change is that
UpdateSharedMemoryConfig skips updating the shared fullPageWrites
during recovery. The original crash is fixed, since this
guarantees that the XLog working area is initialized before reaching
UpdateFullPageWrites().
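
To illustrate the guard described above, here is a minimal,
self-contained sketch. It is not the attached patch: the struct and
every sketch_ name are hypothetical stand-ins, and only the control
flow (skip the shared update while recovery is in progress) follows
the description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared XLog state; not PostgreSQL's struct. */
typedef struct SketchXLogState
{
    bool in_recovery;   /* true until the startup process ends recovery */
    bool initialized;   /* true once the XLog working area is set up */
    bool shared_fpw;    /* shared copy of full_page_writes */
} SketchXLogState;

/*
 * Models the end of recovery: the startup process sets up the working
 * area and the shared flag before recovery is marked as finished.
 */
void
sketch_finish_recovery(SketchXLogState *st, bool fpw_setting)
{
    st->initialized = true;
    st->shared_fpw = fpw_setting;
    st->in_recovery = false;
}

/* Models the shared-flag update; it needs an initialized working area. */
void
sketch_update_full_page_writes(SketchXLogState *st, bool fpw_setting)
{
    assert(st->initialized);
    st->shared_fpw = fpw_setting;
}

/*
 * Models the checkpointer-side config update. During recovery the
 * shared flag is left alone; returns true only if the update ran.
 */
bool
sketch_update_shared_memory_config(SketchXLogState *st, bool fpw_setting)
{
    if (st->in_recovery)
        return false;
    sketch_update_full_page_writes(st, fpw_setting);
    return true;
}
```

In this model, a call made while in_recovery is true returns without
touching shared_fpw, so the assertion on the working area can never
fire before sketch_finish_recovery() has run.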

In addition to that, I changed CheckpointerMain so that it tries to
update the shared FPW regardless of SIGHUP, and provided a new
function that just wakes up the checkpointer. StartupXLOG wakes up
the checkpointer whether or not a checkpoint is required, and the
checkpointer makes the first update of the shared FPW at that time.
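
The wakeup sequence can be sketched roughly as follows. This is a toy
model under stated assumptions, not the code in the patch: the struct,
its fields and all sketch_ functions are hypothetical, and a plain
boolean stands in for the checkpointer's latch.

```c
#include <stdbool.h>

/* Hypothetical model of the state involved; not PostgreSQL's structs. */
typedef struct SketchState
{
    bool in_recovery;     /* true until the startup process ends recovery */
    bool shared_fpw;      /* shared full_page_writes flag */
    bool local_fpw;       /* checkpointer's own copy of the setting */
    bool file_fpw;        /* value a config reload would read */
    bool got_sighup;      /* reload requested */
    bool wakeup_pending;  /* stands in for the checkpointer's latch */
} SketchState;

/* Just wake up the checkpointer, without requesting a checkpoint. */
void
sketch_wakeup_checkpointer(SketchState *st)
{
    st->wakeup_pending = true;
}

/* A reload request also wakes up the checkpointer. */
void
sketch_sighup(SketchState *st)
{
    st->got_sighup = true;
    sketch_wakeup_checkpointer(st);
}

/* End of recovery: wake the checkpointer whether or not a checkpoint is due. */
void
sketch_startup_end_of_recovery(SketchState *st)
{
    st->in_recovery = false;
    sketch_wakeup_checkpointer(st);
}

/* One pass of the main loop; returns true if the shared flag changed. */
bool
sketch_checkpointer_iteration(SketchState *st)
{
    if (!st->wakeup_pending)
        return false;
    st->wakeup_pending = false;

    if (st->got_sighup)
    {
        st->got_sighup = false;
        st->local_fpw = st->file_fpw;   /* reload the setting */
    }

    /* Tried on every wakeup, not only after SIGHUP; skipped in recovery. */
    if (!st->in_recovery && st->shared_fpw != st->local_fpw)
    {
        st->shared_fpw = st->local_fpw;
        return true;
    }
    return false;
}
```

In this model, a reload that arrives during recovery changes only
local_fpw. The shared flag catches up on the first pass after
sketch_startup_end_of_recovery(), even though no further SIGHUP is
sent.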

After this point, everything works the same as the current
behavior.

> > I think this means that it will be difficult for end users to predict
> > unless they track the next checkpoint which isn't too bad, but won't
> > be convenient either.
>
> Looking at the checkpoint record is enough to know whether the
> checkpoint is sufficiently protected by FPW, but I agree that such
> strictness is not crucial.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Inhibit-update-shared-FPW-during-recovery.patch text/x-patch 5.2 KB
