| From: | Noah Misch <noah(at)leadboat(dot)com> | 
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> | 
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: ERROR during end-of-xact/FATAL | 
| Date: | 2013-11-08 21:13:43 | 
| Message-ID: | 20131108211343.GA792232@tornado.leadboat.com | 
| Lists: | pgsql-hackers | 

On Wed, Nov 06, 2013 at 10:14:53AM +0530, Amit Kapila wrote:
> On Thu, Oct 31, 2013 at 8:22 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > If the original AbortTransaction() pertained to a FATAL, the situation is
> > worse.  errfinish() promotes the ERROR thrown from AbortTransaction() to
> > another FATAL,
> 
> isn't errstart promotes ERROR to FATAL?

Right.
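
To make the promotion concrete, here is a simplified model of the decision in question. It is a sketch, not the elog.c source: the `Level` enum, the `BackendState` struct and `promote_level()` are hypothetical names invented for illustration, and the real errstart() consults more state than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical severity levels; only their relative order matters here. */
typedef enum { LVL_WARNING, LVL_ERROR, LVL_FATAL, LVL_PANIC } Level;

/* Hypothetical snapshot of the backend state that influences promotion. */
typedef struct
{
    int  crit_section_count;    /* > 0 while inside a critical section */
    bool have_error_handler;    /* is there a handler to jump back to? */
    bool proc_exit_inprogress;  /* already exiting, e.g. after a FATAL */
} BackendState;

/*
 * Return the level a report is actually raised at.  Anything at ERROR or
 * above becomes PANIC inside a critical section; a plain ERROR becomes
 * FATAL when there is no handler to recover to or the process is already
 * on its way out.
 */
Level
promote_level(Level requested, const BackendState *st)
{
    assert(st->crit_section_count >= 0);

    if (requested < LVL_ERROR)
        return requested;           /* warnings and below are never promoted */
    if (st->crit_section_count > 0)
        return LVL_PANIC;           /* no partial cleanup is trusted here */
    if (requested == LVL_ERROR &&
        (!st->have_error_handler || st->proc_exit_inprogress))
        return LVL_FATAL;           /* the ERROR-during-FATAL case */
    return requested;
}
```

In this model, an ERROR thrown from AbortTransaction() while a FATAL exit is already under way corresponds to `proc_exit_inprogress` being true, so it comes back as FATAL.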
> When I tried above scenario, I hit Assert at different place

...

> This means that in the situation when an ERROR occurs in
> AbortTransaction which is called as a result of FATAL error, there are
> many more possibilities of Assert.

Agreed.

> About unclean FATAL-then-ERROR scenario, one way to deal at high level
> could be to treat such a case as backend crash in which case
> postmaster reinitialises shared memory and other stuff.
> 
> > If we can't manage to
> > free a shared memory resource like a lock or buffer pin, we really must PANIC.
> 
> Can't we try to initialise the shared memory and other resources,
> wouldn't that resolve the problems that can occur due to the scenario
> explained by you?

A PANIC will reinitialize everything relevant, largely resolving the problems
around ERROR during FATAL.  It's a heavy-handed solution, but it may well be
the best solution.  Efforts to harden CommitTransaction() and
AbortTransaction() seem well-spent, but the additional effort to make FATAL
exit cope where AbortTransaction() or another exit action could not cope seems
to be slicing ever-smaller portions of additional robustness.

I pondered a variant of that conclusion that distinguished critical cleanup
needs from the rest.  Each shared resource (heavyweight locks, buffer pins,
LWLocks) would have an on_shmem_exit() callback that cleans up the resource
under a critical section.  (AtProcExit_Buffers() used to fill such a role, but
resowner.c's work during AbortTransaction() has mostly supplanted it.)  The
ShutdownPostgres callback would not use a critical section, so lesser failures
in AbortTransaction() would not upgrade to a PANIC.  But I'm leaning against
such a complication on the grounds that it would add seldom-tested code paths
posing as much a chance of eroding robustness as bolstering it.
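
For concreteness, the variant I decided against could be modeled roughly as follows. This is a standalone toy, not backend code: `register_exit_callback()`, `run_exit_callbacks()`, the `Outcome` enum and the two sample callbacks are hypothetical names, and the newest-first order and the "keep going after a lesser failure" behavior are assumptions of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of how the process exit ended. */
typedef enum { OUTCOME_CLEAN, OUTCOME_FATAL, OUTCOME_PANIC } Outcome;

typedef bool (*cleanup_fn) (void);  /* returns false if cleanup failed */

typedef struct
{
    cleanup_fn fn;
    bool       critical;    /* run under a critical section? */
} ExitCallback;

#define MAX_CALLBACKS 8
static ExitCallback callbacks[MAX_CALLBACKS];
static int n_callbacks = 0;

/* Register a cleanup action; critical ones guard shared resources. */
void
register_exit_callback(cleanup_fn fn, bool critical)
{
    assert(n_callbacks < MAX_CALLBACKS);
    callbacks[n_callbacks].fn = fn;
    callbacks[n_callbacks].critical = critical;
    n_callbacks++;
}

/*
 * Run callbacks newest-first.  Each one is popped before it runs, so a
 * failing callback is never retried.  A failure in a critical callback
 * ends in PANIC at once; a failure elsewhere is remembered as FATAL and
 * the remaining callbacks still get their turn.
 */
Outcome
run_exit_callbacks(void)
{
    Outcome worst = OUTCOME_CLEAN;

    while (n_callbacks > 0)
    {
        ExitCallback cb = callbacks[--n_callbacks];

        if (cb.fn())
            continue;
        if (cb.critical)
            return OUTCOME_PANIC;   /* shared resource could not be freed */
        worst = OUTCOME_FATAL;      /* lesser failure, e.g. in transaction abort */
    }
    return worst;
}

/* Hypothetical stand-ins for real cleanup actions. */
bool cleanup_ok(void)    { return true; }
bool cleanup_fails(void) { return false; }
```

Registering `cleanup_fails` as non-critical alongside a critical `cleanup_ok` yields OUTCOME_FATAL, while marking the failing callback critical yields OUTCOME_PANIC. The critical branch is exactly the kind of path that would rarely be exercised in practice.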
Thanks,
nm
-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com