Re: [BUGS] bug or simply not enough stack space?

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject:	Re: [BUGS] bug or simply not enough stack space?
Date:	2019-10-04 16:06:34
Message-ID:	14790.1570205194@sss.pgh.pa.us
Lists:	pgsql-bugs

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> Reviving this ancient thread. I saw "did not find subXID" errors, in
> 9.6.12. Here is what happened.

> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
> loadhistorydatafromysm_testing2() line 99 during exception cleanup
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: LOG: could not send data to client: Broken
> pipe
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
> loadhistorydatafromysm_testing2() line 99 during exception cleanup
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: STATEMENT: select
> LoadHistoryDataFromYSM_testing2();
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
> loadhistorydatafromysm_testing2() line 99 during exception cleanup
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: STATEMENT: select
> LoadHistoryDataFromYSM_testing2();

[ and then we get into recursive error-during-error-cleanup failures ]

Yeah, something has left stuff in a bad state here.

> *) "loadhistorydatafromysm_testing2()" is using pl/sh, which is a
> known source of weird (but rare) instability issues (I'm assuming this
> is underlying cause of issue)

Hm. Yeah, I'd be way more interested if this could be reproduced
without pl/sh.

> I can't help but wonder if we have some kind of obscure issue that is
> related to C extension problems; just throwing a data point on the
> table.

Well, there's nothing too obscure about the rule that error cleanup
needs to avoid doing anything that might cause another error, for fear
of causing infinite recursion. I suspect that the underlying issue is
that pl/sh is violating that rule somewhere. The other thread you point
to suggests that maybe oracle_fdw also used to do that, and fixed it.

regards, tom lane
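
The rule described above (cleanup code must not raise a new error) can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not PostgreSQL or pl/sh code, and every name in it (raise_error, safe_cleanup, unsafe_cleanup, run_with_cleanup, max_runs) is hypothetical. It uses setjmp/longjmp as a stand-in for an error-handling mechanism, and an artificial cap on cleanup attempts as a stand-in for running out of stack.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
static jmp_buf handler;       /* where every raised error lands */
static int cleanup_runs = 0;  /* how many times cleanup has started */

/* Report an error and jump to the handler; never returns. */
static void raise_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(handler, 1);
}

/* Cleanup that obeys the rule: it reports a problem without raising. */
static void safe_cleanup(void)
{
    cleanup_runs++;
    fprintf(stderr, "WARNING: cleanup found inconsistent state\n");
}

/* Cleanup that breaks the rule: it raises a new error while cleaning up. */
static void unsafe_cleanup(void)
{
    cleanup_runs++;
    raise_error("failure during cleanup");
}

/*
 * Raise one error, then run the given cleanup each time an error lands
 * in the handler. max_runs is an artificial cap (an assumption of this
 * sketch) so the demonstration terminates instead of looping forever.
 * Returns the number of times cleanup was started.
 */
static int run_with_cleanup(void (*cleanup)(void), int max_runs)
{
    assert(max_runs > 0);
    cleanup_runs = 0;

    if (setjmp(handler) == 0)
        raise_error("original failure");

    /* Every longjmp from raise_error resumes here. */
    if (cleanup_runs >= max_runs)
        return cleanup_runs;  /* give up: the cleanup keeps failing */

    cleanup();
    return cleanup_runs;
}
```

In this model, run_with_cleanup(safe_cleanup, 5) returns 1 because cleanup runs once and finishes, while run_with_cleanup(unsafe_cleanup, 5) returns 5 because each cleanup attempt raises another error and re-enters the handler until the cap stops it. Without the cap the second case would never terminate, which is the recursion the rule is meant to prevent.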

In response to

Re: [BUGS] bug or simply not enough stack space? at 2019-10-04 14:12:38 from Merlin Moncure
