From: | Chris Mair <chris(at)1006(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | parallel query in 9.6.6 + oracle_fdw -> ERROR: invalid cache ID: 41 |
Date: | 2017-11-19 18:18:04 |
Message-ID: | 96420364a3d055172776752a1de80714@smtp.hushmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi!
I've found a problem in either PostgreSQL 9.6.6 or oracle_fdw 2.0.0.
Background:
I was testing a setup on a current CentOS 7 system with PostgreSQL 9.6.6
installed from the PGDG repositories and oracle_fdw 2.0.0 installed via pgxn
install. Everything went absolutely fine until I enabled the parallel query
feature (I set max_worker_processes and max_parallel_workers_per_gather both to
8).
Bug:
Whenever a session has performed a query on a foreign table, any subsequent
query on a local table big enough to use the parallel query feature exits with
an error:
ERROR: invalid cache ID: 41
CONTEXT: parallel worker
Whenever a session has NOT performed a query on a foreign table, parallel
queries are executed all OK. No problem.
The problem is not related to what kind of foreign query was executed.
Reproducability is 100%!
Some debugging work:
I've seen this was reported some time ago here:
https://www.postgresql.org/message-id/29113.1488910563%40sss.pgh.pa.us
I've followed Tom Lanes advice and did - on the same machine - a debug enabled
build of the 9.6.6 source code with
./configure --enable-cassert --enable-debug CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer" --prefix=/home/centos/pgbuild
and changed the ERROR in PANIC in src/backend/utils/cache/syscache.c.
I then redid the pgxn install oracle_fdw with paths pointing to the debug
build. The problem was easily reproduced. A count(*) on a bigish table with 4
workers produced 4 core dumps!
Below is the back trace from gdb from each of the cores.
Let me know if there's anything else I can do or check!
Thanks :)
Chris.
[centos(at)asia data]$ ls -l core*
-rw-------. 1 centos centos 152059904 Nov 19 17:43 core.30430
-rw-------. 1 centos centos 152059904 Nov 19 17:43 core.30431
-rw-------. 1 centos centos 152059904 Nov 19 17:43 core.30432
-rw-------. 1 centos centos 152059904 Nov 19 17:43 core.30433
[centos(at)asia data]$ gdb ../bin/postgres core.30430
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/centos/pgbuild/bin/postgres...done.
[New LWP 30430]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: bgworker: parallel worke'.
Program terminated with signal 6, Aborted.
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 libaio-0.3.109-13.el7.x86_64
(gdb) bt
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
#1 0x00007f16a0f4e8e8 in abort () from /lib64/libc.so.6
#2 0x00000000008094b4 in errfinish (dummy=dummy(at)entry=0) at elog.c:557
#3 0x000000000080aea2 in elog_finish (elevel=elevel(at)entry=22, fmt=fmt(at)entry=0x9d9965 "invalid cache ID: %d") at elog.c:1378
#4 0x00000000007ffd82 in SearchSysCacheList (cacheId=cacheId(at)entry=41, nkeys=nkeys(at)entry=2, key1=key1(at)entry=139734905138463, key2=<optimized out>, key3=key3(at)entry=0, key4=key4(at)entry=0) at syscache.c:1210
#5 0x00007f169161a59f in _PG_init () at oracle_fdw.c:709
#6 0x000000000080c476 in internal_load_library (libname=<optimized out>, libname(at)entry=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:276
#7 0x000000000080c708 in RestoreLibraryState (start_address=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:741
#8 0x00000000004e72cf in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1069
#9 0x000000000069b0bf in StartBackgroundWorker () at bgworker.c:742
#10 0x00000000006a70bd in do_start_bgworker (rw=rw(at)entry=0x1f3b8d0) at postmaster.c:5612
#11 0x00000000006a722a in maybe_start_bgworkers () at postmaster.c:5809
#12 0x00000000006a7cf3 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4990
#13 <signal handler called>
#14 0x00007f16a1007783 in __select_nocancel () from /lib64/libc.so.6
#15 0x00000000006a80e5 in ServerLoop () at postmaster.c:1683
#16 0x00000000006a93c0 in PostmasterMain (argc=argc(at)entry=1, argv=argv(at)entry=0x1f13cc0) at postmaster.c:1327
#17 0x0000000000626905 in main (argc=1, argv=0x1f13cc0) at main.c:228
(gdb) quit
[centos(at)asia data]$ gdb ../bin/postgres core.30431
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/centos/pgbuild/bin/postgres...done.
[New LWP 30431]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: bgworker: parallel worke'.
Program terminated with signal 6, Aborted.
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 libaio-0.3.109-13.el7.x86_64
(gdb) bt
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
#1 0x00007f16a0f4e8e8 in abort () from /lib64/libc.so.6
#2 0x00000000008094b4 in errfinish (dummy=dummy(at)entry=0) at elog.c:557
#3 0x000000000080aea2 in elog_finish (elevel=elevel(at)entry=22, fmt=fmt(at)entry=0x9d9965 "invalid cache ID: %d") at elog.c:1378
#4 0x00000000007ffd82 in SearchSysCacheList (cacheId=cacheId(at)entry=41, nkeys=nkeys(at)entry=2, key1=key1(at)entry=139734905138463, key2=<optimized out>, key3=key3(at)entry=0, key4=key4(at)entry=0) at syscache.c:1210
#5 0x00007f169161a59f in _PG_init () at oracle_fdw.c:709
#6 0x000000000080c476 in internal_load_library (libname=<optimized out>, libname(at)entry=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:276
#7 0x000000000080c708 in RestoreLibraryState (start_address=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:741
#8 0x00000000004e72cf in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1069
#9 0x000000000069b0bf in StartBackgroundWorker () at bgworker.c:742
#10 0x00000000006a70bd in do_start_bgworker (rw=rw(at)entry=0x1f3b730) at postmaster.c:5612
#11 0x00000000006a722a in maybe_start_bgworkers () at postmaster.c:5809
#12 0x00000000006a7cf3 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4990
#13 <signal handler called>
#14 0x00007f16a1007783 in __select_nocancel () from /lib64/libc.so.6
#15 0x00000000006a80e5 in ServerLoop () at postmaster.c:1683
#16 0x00000000006a93c0 in PostmasterMain (argc=argc(at)entry=1, argv=argv(at)entry=0x1f13cc0) at postmaster.c:1327
#17 0x0000000000626905 in main (argc=1, argv=0x1f13cc0) at main.c:228
(gdb) quit
[centos(at)asia data]$ gdb ../bin/postgres core.30432
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/centos/pgbuild/bin/postgres...done.
[New LWP 30432]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: bgworker: parallel worke'.
Program terminated with signal 6, Aborted.
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 libaio-0.3.109-13.el7.x86_64
(gdb) bt
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
#1 0x00007f16a0f4e8e8 in abort () from /lib64/libc.so.6
#2 0x00000000008094b4 in errfinish (dummy=dummy(at)entry=0) at elog.c:557
#3 0x000000000080aea2 in elog_finish (elevel=elevel(at)entry=22, fmt=fmt(at)entry=0x9d9965 "invalid cache ID: %d") at elog.c:1378
#4 0x00000000007ffd82 in SearchSysCacheList (cacheId=cacheId(at)entry=41, nkeys=nkeys(at)entry=2, key1=key1(at)entry=139734905138463, key2=<optimized out>, key3=key3(at)entry=0, key4=key4(at)entry=0) at syscache.c:1210
#5 0x00007f169161a59f in _PG_init () at oracle_fdw.c:709
#6 0x000000000080c476 in internal_load_library (libname=<optimized out>, libname(at)entry=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:276
#7 0x000000000080c708 in RestoreLibraryState (start_address=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:741
#8 0x00000000004e72cf in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1069
#9 0x000000000069b0bf in StartBackgroundWorker () at bgworker.c:742
#10 0x00000000006a70bd in do_start_bgworker (rw=rw(at)entry=0x1f3b590) at postmaster.c:5612
#11 0x00000000006a722a in maybe_start_bgworkers () at postmaster.c:5809
#12 0x00000000006a7cf3 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4990
#13 <signal handler called>
#14 0x00007f16a1007783 in __select_nocancel () from /lib64/libc.so.6
#15 0x00000000006a80e5 in ServerLoop () at postmaster.c:1683
#16 0x00000000006a93c0 in PostmasterMain (argc=argc(at)entry=1, argv=argv(at)entry=0x1f13cc0) at postmaster.c:1327
#17 0x0000000000626905 in main (argc=1, argv=0x1f13cc0) at main.c:228
(gdb) quit
[centos(at)asia data]$ gdb ../bin/postgres core.30433
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/centos/pgbuild/bin/postgres...done.
[New LWP 30433]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: bgworker: parallel worke'.
Program terminated with signal 6, Aborted.
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 libaio-0.3.109-13.el7.x86_64
(gdb) bt
#0 0x00007f16a0f4d1f7 in raise () from /lib64/libc.so.6
#1 0x00007f16a0f4e8e8 in abort () from /lib64/libc.so.6
#2 0x00000000008094b4 in errfinish (dummy=dummy(at)entry=0) at elog.c:557
#3 0x000000000080aea2 in elog_finish (elevel=elevel(at)entry=22, fmt=fmt(at)entry=0x9d9965 "invalid cache ID: %d") at elog.c:1378
#4 0x00000000007ffd82 in SearchSysCacheList (cacheId=cacheId(at)entry=41, nkeys=nkeys(at)entry=2, key1=key1(at)entry=139734905138463, key2=<optimized out>, key3=key3(at)entry=0, key4=key4(at)entry=0) at syscache.c:1210
#5 0x00007f169161a59f in _PG_init () at oracle_fdw.c:709
#6 0x000000000080c476 in internal_load_library (libname=<optimized out>, libname(at)entry=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:276
#7 0x000000000080c708 in RestoreLibraryState (start_address=0x7f16a1bfdde8 <Address 0x7f16a1bfdde8 out of bounds>) at dfmgr.c:741
#8 0x00000000004e72cf in ParallelWorkerMain (main_arg=<optimized out>) at parallel.c:1069
#9 0x000000000069b0bf in StartBackgroundWorker () at bgworker.c:742
#10 0x00000000006a70bd in do_start_bgworker (rw=rw(at)entry=0x1f3b390) at postmaster.c:5612
#11 0x00000000006a722a in maybe_start_bgworkers () at postmaster.c:5809
#12 0x00000000006a7cf3 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:4990
#13 <signal handler called>
#14 0x00007f16a1007783 in __select_nocancel () from /lib64/libc.so.6
#15 0x00000000006a80e5 in ServerLoop () at postmaster.c:1683
#16 0x00000000006a93c0 in PostmasterMain (argc=argc(at)entry=1, argv=argv(at)entry=0x1f13cc0) at postmaster.c:1327
#17 0x0000000000626905 in main (argc=1, argv=0x1f13cc0) at main.c:228
(gdb) quit
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2017-11-19 18:33:01 | Re: How to store multiple rows in array . |
Previous Message | Pavel Stehule | 2017-11-19 17:59:47 | Re: How to store multiple rows in array . |