From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | jimmy <mpokky(at)126(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Re: Bug: ERROR: invalid cache ID: 42 CONTEXT: parallel worker |
Date: | 2018-08-22 04:47:30 |
Message-ID: | CAEepm=30uOeesrmZWBj6zFh-E2hByJseyoM2ZtUS2r0E5G9zyA@mail.gmail.com |
Lists: | pgsql-bugs |
On Wed, Aug 22, 2018 at 2:54 PM, jimmy <mpokky(at)126(dot)com> wrote:
> This is the debug log below. Is it useful. Thank you.
That's not showing the path that reaches the error. If it's happening
in a parallel worker, that'll probably be tricky to catch with a
breakpoint. Are you able to recompile PostgreSQL? If you could do
that after changing all cases of elog(ERROR, "invalid cache ID: %d",
cacheId) to PANIC instead of ERROR, and then start it with ulimit -c
unlimited, you might get a core file that you can load into a debugger
to see how we reached it.
It's a strange error. I don't think it can be coming from these
places in inval.c:

    if (cacheid < 0 || cacheid >= SysCacheSize)
        elog(ERROR, "invalid cache ID: %d", cacheid);

... because we can see that it's 42 (PROCNAMEARGSNSP, a valid cache
ID), and SysCacheSize is a compile-time constant greater than 42. So
it must be coming from one of the places in syscache.c that look like
this:

    if (cacheId < 0 || cacheId >= SysCacheSize ||
        !PointerIsValid(SysCache[cacheId]))
        elog(ERROR, "invalid cache ID: %d", cacheId);

Since InitCatalogCache() puts a non-NULL pointer into every index from
0 to SysCacheSize - 1 without gaps (or it errors out if it fails while
trying), it seems like either InitCatalogCache() didn't run, or
SysCache[42] has later been overwritten with NULL? I wondered if
there is some way for a parallel worker to reach shared invalidation
message processing code before InitCatalogCache() has run, but
that doesn't seem to be an issue: SysCacheInvalidate() quietly
tolerates that.
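
To make the distinction between the two checks concrete, here is a toy model of the reasoning above. It is only a sketch and not PostgreSQL code: every toy_* name is made up for illustration, and the array size of 64 is an arbitrary stand-in for SysCacheSize (any value greater than 42 would do). It shows that ID 42 always passes the range-only check, but fails the syscache.c-style check for as long as its slot is still NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT PostgreSQL's definitions. */
enum
{
    TOY_PROCNAMEARGSNSP = 42,   /* the cache ID from the error message */
    TOY_SYS_CACHE_SIZE = 64     /* arbitrary assumption, just > 42 */
};

typedef struct ToyCache
{
    int id;
} ToyCache;

static ToyCache toy_storage[TOY_SYS_CACHE_SIZE];
/* File-scope array, so every slot starts out as NULL. */
static ToyCache *toy_sys_cache[TOY_SYS_CACHE_SIZE];

/* Models the idea of InitCatalogCache(): fill every slot, no gaps. */
void
toy_init_catalog_cache(void)
{
    for (int i = 0; i < TOY_SYS_CACHE_SIZE; i++)
    {
        toy_storage[i].id = i;
        toy_sys_cache[i] = &toy_storage[i];
        assert(toy_sys_cache[i] != NULL);
    }
}

/* inval.c-style check: only the range is tested. */
int
toy_range_check_fails(int cache_id)
{
    return cache_id < 0 || cache_id >= TOY_SYS_CACHE_SIZE;
}

/* syscache.c-style check: range, plus the slot must be non-NULL. */
int
toy_lookup_check_fails(int cache_id)
{
    return cache_id < 0 || cache_id >= TOY_SYS_CACHE_SIZE ||
        toy_sys_cache[cache_id] == NULL;
}
```

In this model, toy_lookup_check_fails(TOY_PROCNAMEARGSNSP) is true until toy_init_catalog_cache() has been called, and false afterwards, while toy_range_check_fails(TOY_PROCNAMEARGSNSP) is false either way. That is why the error for ID 42 points at a lookup happening against an uninitialized (or later clobbered) slot rather than at a bad ID.
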
I wonder how we could reach one of SearchSysCache(PROCNAMEARGSNSP,
...), SysCacheGetAttr(PROCNAMEARGSNSP, ...),
GetSysCacheHashValue(PROCNAMEARGSNSP, ...),
SearchSysCacheList(PROCNAMEARGSNSP, ...) before InitCatalogCache() has
finished? The answer probably involves oracle_fdw.
Ahh, how about this line here:
https://github.com/laurenz/oracle_fdw/blob/master/oracle_fdw.c#L6237

    catlist = SearchSysCacheList2(
        PROCNAMEARGSNSP,
        CStringGetDatum("geometry_recv"),
        PointerGetDatum(buildoidvector(argtypes, argcount)));

I don't immediately see how that can be reached before
InitCatalogCache() has run, though.
--
Thomas Munro
http://www.enterprisedb.com