| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | pgsql-hackers(at)postgreSQL(dot)org |
| Subject: | Re: Something is broken about connection startup |
| Date: | 2016-11-10 23:04:34 |
| Message-ID: | 23418.1478819074@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
I wrote:
> A quick look through the sources confirms that this error implies that
> SearchSysCache on the RELOID cache must have failed to find a tuple for
> pg_proc --- there are many occurrences of this text, but they all are
> reporting that. Which absolutely should not be happening now that we use
> MVCC catalog scans, concurrent updates or no. So I think this is a bug,
> and possibly a fairly-recently-introduced one, because I can't remember
> seeing buildfarm failures like this one before.
After tweaking elog.c to promote FATAL to PANIC, I got stack traces
confirming that the error occurs here:
#0 0x0000003779a325e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003779a33dc5 in abort () at abort.c:92
#2 0x000000000080d177 in errfinish (dummy=<value optimized out>) at elog.c:560
#3 0x000000000080df94 in elog_finish (elevel=<value optimized out>,
    fmt=<value optimized out>) at elog.c:1381
#4 0x0000000000801859 in RelationCacheInitializePhase3 () at relcache.c:3444
#5 0x000000000081a145 in InitPostgres (in_dbname=<value optimized out>, dboid=0,
    username=<value optimized out>, useroid=<value optimized out>, out_dbname=0x0)
    at postinit.c:982
#6 0x0000000000710c81 in PostgresMain (argc=1, argv=<value optimized out>,
    dbname=0x24d4c40 "regression", username=0x24abc88 "postgres") at postgres.c:3728
#7 0x00000000006a6eae in BackendRun (argc=<value optimized out>,
    argv=<value optimized out>) at postmaster.c:4271
#8 BackendStartup (argc=<value optimized out>, argv=<value optimized out>)
    at postmaster.c:3945
#9 ServerLoop (argc=<value optimized out>, argv=<value optimized out>)
    at postmaster.c:1701
#10 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>)
    at postmaster.c:1309
#11 0x00000000006273d8 in main (argc=3, argv=0x24a9b20) at main.c:228
So it's happening when RelationCacheInitializePhase3 is trying to replace
a fake pg_class row for pg_proc (made by formrdesc) with the real one.
That's even odder, because that's late enough that this should be a pretty
ordinary catalog lookup. Now I wonder if it's possible that this can be
seen during ordinary relation opens after connection startup. If so, it
would almost surely be a recently-introduced bug, else we'd have heard
about this from the field.
regards, tom lane
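For readers unfamiliar with the step described above, here is a toy model of the pattern: a cache entry is first built from hardwired data, then its placeholder row is replaced by a row looked up in the catalog, and a failed lookup is the error case. This is only a sketch under stated assumptions, not the code in relcache.c. Every name in it (ToyClassRow, ToyRelCacheEntry, toy_form_fake_entry, toy_lookup, toy_fix_fake_entry) is hypothetical, and the array scan stands in for the real syscache lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int Oid;

/* Hypothetical, heavily reduced stand-in for a pg_class row. */
typedef struct {
    Oid  oid;
    char relname[32];
    int  relnatts;
} ToyClassRow;

/* Hypothetical stand-in for a relcache entry. */
typedef struct {
    Oid         relid;
    bool        is_fake;   /* true while the row is a hardwired placeholder */
    ToyClassRow row;
} ToyRelCacheEntry;

/* Build an entry from hardwired data, in the spirit of formrdesc. */
ToyRelCacheEntry toy_form_fake_entry(Oid relid, const char *name, int natts)
{
    ToyRelCacheEntry e;

    memset(&e, 0, sizeof e);
    e.relid = relid;
    e.is_fake = true;
    e.row.oid = relid;
    strncpy(e.row.relname, name, sizeof e.row.relname - 1);
    e.row.relnatts = natts;
    return e;
}

/* Linear scan of a toy catalog; NULL models a lookup that finds no tuple. */
const ToyClassRow *toy_lookup(const ToyClassRow *catalog, size_t n, Oid relid)
{
    for (size_t i = 0; i < n; i++)
        if (catalog[i].oid == relid)
            return &catalog[i];
    return NULL;
}

/*
 * Replace the placeholder row with the catalog's row.
 * Returns false when the lookup fails; that is the case which, in the
 * message above, surfaces as the error raised during connection startup.
 * On failure the entry is left untouched.
 */
bool toy_fix_fake_entry(ToyRelCacheEntry *e, const ToyClassRow *catalog, size_t n)
{
    const ToyClassRow *real = toy_lookup(catalog, n, e->relid);

    if (real == NULL)
        return false;
    e->row = *real;
    e->is_fake = false;
    return true;
}
```

In this model the caller would treat a false return from toy_fix_fake_entry as fatal. The puzzle in the message is that the corresponding real lookup should always find the row, since by this point it is an ordinary catalog lookup.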