Re: atomic pin/unpin causing errors

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: atomic pin/unpin causing errors
Date: 2016-05-06 18:15:03
Message-ID: CAMkU=1w81mDchzwcyZdkHU1NbuFWLOH1aKbnug9Otn1RgP3NZA@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 5, 2016 at 11:52 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi Jeff,
>
> On 2016-04-29 10:38:55 -0700, Jeff Janes wrote:
>> I don't see the problem with a cassert-enabled build, probably because it
>> is just too slow to ever reach the point where the problem occurs.
>
> Running the test with cassert enabled I actually get assertion failures,
> due to the FATAL you added.
>
> #1 0x0000000000958dde in ExceptionalCondition (conditionName=0xb36c2a "!(RefCountErrors == 0)", errorType=0xb361af "FailedAssertion",
> fileName=0xb36170 "/home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c", lineNumber=2506) at /home/admin/src/postgresql/src/backend/utils/error/assert.c:54
> #2 0x00000000007c9fc9 in CheckForBufferLeaks () at /home/admin/src/postgresql/src/backend/storage/buffer/bufmgr.c:2506
> ...
>
> You didn't see those?

Yes, I have been seeing those on assert-enabled builds going back as
far as I can remember (long before this particular problem started
showing up). I just assumed it was a natural consequence of throwing
an ERROR from inside a critical section. I never really understood
it: why would a panicking process bother to check for buffer leaks in
the first place? It is leaking everything, which is why the entire
system has to be brought down immediately.
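
To make that assumption concrete, here is a toy model of the escalation
I have in mind. It is only a sketch, not the real backend code: every
name in it (crit_section_count, effective_level, the LVL_* values) is a
hypothetical stand-in, and it assumes that any report at ERROR or above
raised while a critical section is open gets escalated to PANIC.

```c
#include <assert.h>

/* Toy severity levels, ordered from least to most severe (hypothetical). */
typedef enum { LVL_WARNING, LVL_ERROR, LVL_FATAL, LVL_PANIC } Level;

/* Stand-in for the backend's critical-section nesting counter. */
int crit_section_count = 0;

/* Enter a critical section; sections may nest. */
void start_crit_section(void)
{
    crit_section_count++;
}

/* Leave a critical section; must pair with start_crit_section(). */
void end_crit_section(void)
{
    assert(crit_section_count > 0);
    crit_section_count--;
}

/*
 * Return the level a report would actually be raised at under the
 * assumption above: ERROR or worse inside a critical section becomes
 * PANIC, anything else is left alone.
 */
Level effective_level(Level requested)
{
    if (requested >= LVL_ERROR && crit_section_count > 0)
        return LVL_PANIC;
    return requested;
}
```

In this model, effective_level(LVL_ERROR) returns LVL_ERROR outside a
critical section and LVL_PANIC between start_crit_section() and
end_crit_section(). It says nothing about why the leak check still runs
on the way down, which is the part I don't understand.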

I have been trying (and failing) to reproduce the problem in more
recent releases, with and without cassert. Here is pg_config output
of one of my current attempts:

BINDIR = /home/jjanes/pgsql/torn_bisect/bin
DOCDIR = /home/jjanes/pgsql/torn_bisect/share/doc
HTMLDIR = /home/jjanes/pgsql/torn_bisect/share/doc
INCLUDEDIR = /home/jjanes/pgsql/torn_bisect/include
PKGINCLUDEDIR = /home/jjanes/pgsql/torn_bisect/include
INCLUDEDIR-SERVER = /home/jjanes/pgsql/torn_bisect/include/server
LIBDIR = /home/jjanes/pgsql/torn_bisect/lib
PKGLIBDIR = /home/jjanes/pgsql/torn_bisect/lib
LOCALEDIR = /home/jjanes/pgsql/torn_bisect/share/locale
MANDIR = /home/jjanes/pgsql/torn_bisect/share/man
SHAREDIR = /home/jjanes/pgsql/torn_bisect/share
SYSCONFDIR = /home/jjanes/pgsql/torn_bisect/etc
PGXS = /home/jjanes/pgsql/torn_bisect/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = 'CFLAGS=-ggdb' '--with-extra-version=-c1543a8'
'--enable-debug' '--with-libxml' '--with-perl' '--with-python'
'--with-ldap' '--with-openssl' '--with-gssapi' '--enable-cassert'
'--prefix=/home/jjanes/pgsql/torn_bisect/'
CC = gcc
CPPFLAGS = -DFRONTEND -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -ggdb
CFLAGS_SL = -fpic
LDFLAGS = -L../../src/common -Wl,--as-needed
-Wl,-rpath,'/home/jjanes/pgsql/torn_bisect/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lxml2 -lssl -lcrypto -lgssapi_krb5 -lz
-lreadline -lrt -lcrypt -ldl -lm
VERSION = PostgreSQL 9.6devel-c1543a8

The only differences between this build and the ones that did hit the
error are toggling --enable-cassert and changing which git commit was
used (and manually applying the gin_alone patch when testing commits
that precede the one in which it was committed).

Linux: 2.6.32-573.22.1.el6.x86_64 #1 SMP Wed Mar 23 03:35:39 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux

/proc/cpuinfo:

processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-4640 v2 @ 2.20GHz
stepping : 4
microcode : 4294967295
cpu MHz : 2199.933
cache size : 20480 KB
physical id : 0
siblings : 8
core id : 7
cpu cores : 8
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm
constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1
sse4_2 popcnt aes xsave avx f16c rdrand hypervisor lahf_lm xsaveopt
fsgsbase smep erms
bogomips : 4399.86
clflush size : 64
cache_alignment : 64
address sizes : 42 bits physical, 48 bits virtual
power management:

Cheers,

Jeff
