From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, Frank van Vugt <ftm(dot)van(dot)vugt(at)foxi(dot)nl>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: [BUGS] bug or simply not enough stack space? |
Date: | 2019-10-04 16:06:34 |
Message-ID: | 14790.1570205194@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> Reviving this ancient thread. I saw "did not find subXID" errors, in
> 9.6.12. Here is what happened.
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: WARNING: did not find subXID 384134 in MyProc
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
> loadhistorydatafromysm_testing2() line 99 during exception cleanup
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: LOG: could not send data to client: Broken
> pipe
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
> loadhistorydatafromysm_testing2() line 99 during exception cleanup
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: STATEMENT: select
> LoadHistoryDataFromYSM_testing2();
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: ERROR: failed to re-find shared lock object
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: CONTEXT: PL/pgSQL function
> loadhistorydatafromysm_testing2() line 99 during exception cleanup
> 2019-10-03 19:58:37 CDT [10.22.236.83: rms(at)cds2]
> [10.22.236.83(54943)]: STATEMENT: select
> LoadHistoryDataFromYSM_testing2();
[ and then we get into recursive error-during-error-cleanup failures ]
Yeah, something has left stuff in a bad state here.
> *) "loadhistorydatafromysm_testing2()" is using pl/sh, which is a
> known source of weird (but rare) instability issues (I'm assuming this
> is underlying cause of issue)
Hm. Yeah, I'd be way more interested if this could be reproduced
without pl/sh.
> I can't help but wonder if we have some kind of obscure issue that is
> related to C extension problems; just throwing a data point on the
> table.
Well, there's nothing too obscure about the rule that error cleanup
needs to avoid doing anything that might cause another error, for fear
of causing infinite recursion. I suspect that the underlying issue is
that pl/sh is violating that rule somewhere. The other thread you point
to suggests that maybe oracle_fdw also used to do that, and fixed it.
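As a toy illustration of that rule (this is not PostgreSQL or pl/sh code; handle_error, unsafe_release and safe_release are hypothetical names made up for the sketch), consider a cleanup step that raises when it finds state already inconsistent, versus one that just records a warning and carries on. The first can never finish once things are in a bad state, so error handling keeps re-entering itself until some limit stops it:

```python
def handle_error(cleanup, depth=0, max_depth=5):
    """Run cleanup; if cleanup raises, re-enter error handling.

    Returns the nesting depth at which cleanup finally succeeded.
    """
    if depth >= max_depth:
        # Stand-in for a server giving up after too many nested errors.
        raise RuntimeError("error-handling recursion limit reached")
    try:
        cleanup()
    except Exception:
        # The cleanup error itself needs handling, which runs cleanup again.
        return handle_error(cleanup, depth + 1, max_depth)
    return depth


def unsafe_release(held, name):
    # Raises when state is already inconsistent, so cleanup can never finish.
    if name not in held:
        raise LookupError(f"failed to re-find lock {name!r}")
    held.remove(name)


def safe_release(held, name, warnings):
    # Records the inconsistency and carries on instead of raising.
    if name not in held:
        warnings.append(f"did not find lock {name!r}")
        return
    held.remove(name)


if __name__ == "__main__":
    warnings = []
    # Safe cleanup completes on the first attempt even with the lock missing.
    print(handle_error(lambda: safe_release(set(), "x", warnings)))
    try:
        # Unsafe cleanup fails on every attempt until the limit is hit.
        handle_error(lambda: unsafe_release(set(), "x"))
    except RuntimeError as exc:
        print(exc)
```

The depth limit here is only an assumption of the sketch, standing in for whatever backstop a real system has; the point is that the safe variant never needs it.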
regards, tom lane