From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, pryzby(at)telsasoft(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)" |
Date: | 2018-07-18 03:42:20 |
Message-ID: | 20180718.124220.08087886.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello. I confirmed that this patch fixes the crash.
At Tue, 17 Jul 2018 20:01:05 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in <14892(dot)1531872065(at)sss(dot)pgh(dot)pa(dot)us>
> I wrote:
> >> So I said I didn't want to do extra work on this, but I am looking into
> >> fixing it by having these aux process types run a ResourceOwner that can
> >> be told to clean up any open buffer pins at exit.
>
> > That turned out to be a larger can of worms than I'd anticipated, as it
> > soon emerged that we'd acquired a whole lot of cargo-cult programming
> > around ResourceOwners. ...
> > I'm mostly pretty happy with this, but I think there are a couple of
> > loose ends in logicalfuncs.c and slotfuncs.c: those are creating
> > non-standalone ResourceOwners (children of whatever the active
> > ResourceOwner is) and doing nothing much to clean them up. That seems
> > pretty bogus.
>
> Further investigation showed that the part of that code that was
> actually needed was not the creation of a child ResourceOwner, but rather
> restoration of the old CurrentResourceOwner setting after exiting the
> logical decoding loop. Apparently we can come out of that with the
> TopTransaction resowner being active. This still seems a bit bogus;
> maybe there should be a save-and-restore happening somewhere else?
> But I'm not really excited about doing more than commenting it.
CurrentResourceOwner doesn't seem to be changed. It is just saved
during snapshot export and used just as a flag only for checking
for duplicate snapshot exporting.
> Also, most of the other random creations of ResourceOwners seem to just
> not be necessary at all, even with the new rule that you must have a
> resowner to acquire buffer pins. So the attached revised patch just
> gets rid of them, and improves some misleading/wrong comments on the
> topic. It'd still be easy to use CreateAuxProcessResourceOwner in any
> process where we discover we need one, but I don't see the value in adding
> useless overhead.
+1 for unifying to resowner for auxiliary processes. I found the
comment below just before ending cleanup of auxiliary process
main funcs.
| * These operations are really just a minimal subset of
| * AbortTransaction(). We don't have very many resources to worry
| * about in checkpointer, but we do have LWLocks, buffers, and temp
| * files.
So couldn't we use TopTransactionResourceOwner instead of
AuxProcessResrouceOwner? I feel a bit uneasy that bootstrap and
standalone-backend have *AuxProcess*ResourceOwner.
It's not about this ptch, but while looking this closer, I found
the following comment on ShutdownXLOG().
| /*
| * This must be called ONCE during postmaster or standalone-backend shutdown
| */
Is the "postmaster" typo of "bootstrap process"? It is also
called from checkpointer when non-standlone mode.
> At this point I'm leaning to just applying this in HEAD and calling it
> good. The potential for assertion failures isn't relevant to production
> builds, and my best guess at this point is that there isn't really any
> other user-visible bug. The resowners that were being created and not
> adequately cleaned up all seem to have been dead code anyway (ie they'd
> never acquire any resources).
Agreed.
> We could have a problem if, say, the
> bgwriter exited via the FATAL path while holding a pin, but I don't think
> there's a reason for it to do that except in a database-wide shutdown,
> where the consequences of a leaked pin seem pretty minimal.
>
> Any objections? Anyone want to do further review?
FWIW I think we won't be concerned about leaked pins after FATAL.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2018-07-18 04:02:35 | Re: missing toast table for pg_policy |
Previous Message | Craig Ringer | 2018-07-18 03:12:06 | Re: [bug fix] Produce a crash dump before main() on Windows |