
Re: Server crash (FailedAssertion) due to catcache refcount mis-handling

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Server crash (FailedAssertion) due to catcache refcount mis-handling
Date:	2017-08-08 15:36:17
Message-ID:	4244.1502206577@sss.pgh.pa.us
Lists:	pgsql-hackers

Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> writes:
> We have observed a random server crash (FailedAssertion) while running a few
> tests at our end. The stack trace is attached.

> Looking at the stack trace, and after discussing it with my team members,
> we observed that SearchCatCacheList() increments the refcount and then
> decrements it at the end. However, if we are inside the TRY() block (where
> we increment the refcount) and are hit by an interrupt, we fail to
> decrement the refcount, and later we get the assertion failure.

Hm. So SearchCatCacheList has a PG_TRY block that is meant to release
those refcounts, but if you hit the backend with a SIGTERM while it's
in that function, control goes out through elog(FATAL) which doesn't
execute the PG_CATCH cleanup. But it does do AbortTransaction which
calls AtEOXact_CatCache, and that is expecting that all the cache
refcounts have reached zero.
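
The failure mode described above can be modeled outside the server. The following is only a toy sketch in standalone C, not PostgreSQL code: every `sim_` name is hypothetical, `setjmp`/`longjmp` stand in for the PG_TRY machinery, and the "FATAL" path is assumed to jump straight to the exit handler without visiting the catch block. Unlike a real PG_CATCH block, the catch branch here swallows the error instead of re-throwing it.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
typedef enum { SIM_OK, SIM_ERROR, SIM_FATAL } SimLevel;

static int      sim_refcount = 0;         /* models a catcache refcount */
static jmp_buf *sim_catch_target = NULL;  /* models the innermost PG_TRY */
static jmp_buf  sim_exit_target;          /* models the process-exit path */

/* Models elog(): ERROR unwinds to the catch block, FATAL bypasses it. */
static void sim_report(SimLevel level)
{
    if (level == SIM_ERROR)
        longjmp(*sim_catch_target, 1);
    if (level == SIM_FATAL)
        longjmp(sim_exit_target, 1);
}

/* Models a function that pins an entry inside a try block. */
static void sim_search_list(SimLevel level)
{
    jmp_buf  local;
    jmp_buf *save = sim_catch_target;

    sim_refcount++;                       /* take the reference */
    if (setjmp(local) == 0)
    {
        sim_catch_target = &local;
        sim_report(level);                /* may not return */
    }
    else
    {
        /* catch block: reached for ERROR only */
        sim_catch_target = save;
        assert(sim_refcount > 0);
        sim_refcount--;
        return;
    }
    sim_catch_target = save;
    assert(sim_refcount > 0);
    sim_refcount--;                       /* normal release */
}

/* Returns the refcount that an end-of-transaction check would see. */
int sim_run(SimLevel level)
{
    sim_refcount = 0;
    sim_catch_target = NULL;
    if (setjmp(sim_exit_target) == 0)
        sim_search_list(level);
    return sim_refcount;
}
```

In this model `sim_run(SIM_OK)` and `sim_run(SIM_ERROR)` both leave the count at zero, while `sim_run(SIM_FATAL)` leaves it at one, which is the condition the end-of-transaction check objects to.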

We could respond to this by using PG_ENSURE_ERROR_CLEANUP there instead
of plain PG_TRY. But I have an itchy feeling that there may be a lot
of places with similar issues. Should we be revisiting the basic way
that elog(FATAL) works, to make it less unlike elog(ERROR)?
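
To illustrate why PG_ENSURE_ERROR_CLEANUP would help, here is a rough standalone model, again not PostgreSQL code. It assumes the idea behind that macro is to pair the try block with an exit-time callback, so that the same cleanup function runs on the FATAL path as well as the ERROR path. All `demo_` names are hypothetical, the callback list is reduced to a single slot, and the catch branch swallows the error rather than re-throwing.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
typedef enum { DEMO_OK, DEMO_ERROR, DEMO_FATAL } DemoLevel;
typedef void (*DemoCallback) (void *arg);

static int          demo_refcount = 0;
static jmp_buf     *demo_catch_target = NULL;
static jmp_buf      demo_exit_target;
static DemoCallback demo_exit_cb = NULL;  /* one-slot exit-callback list */
static void        *demo_exit_arg = NULL;

/* The shared cleanup function: drop one reference. */
static void demo_release(void *arg)
{
    int *refcount = arg;

    assert(*refcount > 0);
    (*refcount)--;
}

/* ERROR unwinds to the catch block; FATAL runs exit callbacks, then exits. */
static void demo_report(DemoLevel level)
{
    if (level == DEMO_ERROR)
        longjmp(*demo_catch_target, 1);
    if (level == DEMO_FATAL)
    {
        if (demo_exit_cb != NULL)
            demo_exit_cb(demo_exit_arg);
        longjmp(demo_exit_target, 1);
    }
}

static void demo_search_list(DemoLevel level)
{
    jmp_buf  local;
    jmp_buf *save = demo_catch_target;

    demo_refcount++;
    demo_exit_cb = demo_release;          /* register cleanup for FATAL */
    demo_exit_arg = &demo_refcount;
    if (setjmp(local) == 0)
    {
        demo_catch_target = &local;
        demo_report(level);               /* may not return */
    }
    else
    {
        /* ERROR path: cancel the callback, run the cleanup here */
        demo_catch_target = save;
        demo_exit_cb = NULL;
        demo_release(&demo_refcount);
        return;
    }
    /* Normal path: cancel the callback and release explicitly. */
    demo_catch_target = save;
    demo_exit_cb = NULL;
    demo_release(&demo_refcount);
}

/* Returns the refcount that an end-of-transaction check would see. */
int demo_run(DemoLevel level)
{
    demo_refcount = 0;
    demo_catch_target = NULL;
    demo_exit_cb = NULL;
    if (setjmp(demo_exit_target) == 0)
        demo_search_list(level);
    return demo_refcount;
}
```

Here `demo_run` returns zero for all three levels. The cost is visible too: every call site that holds a resource across a possible FATAL exit needs its own registration, which is the reason for asking whether elog(FATAL) itself should change.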

regards, tom lane

In response to

Server crash (FailedAssertion) due to catcache refcount mis-handling at 2017-08-08 12:34:56 from Jeevan Chalke

Responses

Re: Server crash (FailedAssertion) due to catcache refcount mis-handling at 2017-08-08 15:54:26 from Robert Haas
