BUG #14180: Segmentation fault on replication slave

From: boa(at)neogrid(dot)dk
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14180: Segmentation fault on replication slave
Date: 2016-06-07 09:16:18
Message-ID: 20160607091618.1385.29368@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14180
Logged by: Bo Ørsted Andresen
Email address: boa(at)neogrid(dot)dk
PostgreSQL version: 9.5.3
Operating system: Ubuntu 16.04 LTS
Description:

Hello,

We have a replication setup that uses a replication slot, and the slave
crashes with a segmentation fault within eight hours of being rebuilt.

In the following, the master is 10.0.0.2 and the slave is 10.0.0.3.

On the master we use the default configuration plus the following settings:

/etc/postgresql/9.5/main/postgresql.conf
----------------------------------------
listen_addresses = '*'
port = 5433
wal_level = archive
archive_mode = on
archive_command = 'test ! -f /mnt/postgres_archive/%f && cp %p /mnt/postgres_archive/%f'
max_wal_senders = 5
wal_keep_segments = 4000
max_replication_slots = 1
timezone = 'UTC'
----------------------------------------
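
For readers unfamiliar with the archive_command above, here is a minimal
Python sketch of what the shell one-liner does: copy a finished WAL segment
into the archive directory, but refuse to overwrite a file of the same name.
The function name archive_segment and the demo paths are hypothetical and
used only for illustration; the server itself runs the shell command, with
%p replaced by the path of the segment and %f by its file name.

```python
import os
import shutil
import tempfile


def archive_segment(src_path: str, archive_dir: str) -> bool:
    """Mimic: test ! -f ARCHIVE/%f && cp %p ARCHIVE/%f

    Returns True when the segment was copied and False when a file with the
    same name already exists (the shell command exits non-zero in that case).
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.isfile(dest):
        return False  # never overwrite an already archived segment
    shutil.copy(src_path, dest)
    return True


def demo() -> tuple:
    """Archive the same dummy segment twice and report both outcomes."""
    with tempfile.TemporaryDirectory() as wal_dir, \
            tempfile.TemporaryDirectory() as archive_dir:
        segment = os.path.join(wal_dir, "000000010000000000000001")
        with open(segment, "wb") as fh:
            fh.write(b"\x00" * 16)  # placeholder bytes, not a real WAL segment
        first = archive_segment(segment, archive_dir)
        second = archive_segment(segment, archive_dir)
        return first, second
```

Calling demo() archives the dummy segment once and rejects the second
attempt, which mirrors the non-zero exit status the shell command returns
when the target file already exists.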

/etc/postgresql/9.5/main/pg_hba.conf
----------------------------------------
host replication postgres 10.0.0.3/32 trust
----------------------------------------

On the slave we use the default configuration with only the timezone changed:

/etc/postgresql/9.5/main/postgresql.conf
----------------------------------------
timezone = 'UTC'
----------------------------------------

On the master we run the SQL query:

SELECT * FROM pg_create_physical_replication_slot('slave');

On the slave we run the command:

pg_basebackup -P -R -X stream -c fast -h 10.0.0.2 -p 5433 -U postgres -D /var/lib/postgresql/9.5/main

After this, recovery.conf looks like this (we added the primary_slot_name line ourselves):

standby_mode = 'on'
primary_conninfo = 'user=postgres host=10.0.0.2 port=5433 sslmode=prefer sslcompression=1 krbsrvname=postgres'
primary_slot_name = 'slave'

Then we fix the ownership and start the slave database. After a while,
anywhere between ten minutes and eight hours, we get this error in the log
file:

2016-06-03 05:55:27 UTC [27303-4] LOG: startup process (PID 27305) was terminated by signal 11: Segmentation fault
2016-06-03 05:55:27 UTC [27303-5] LOG: terminating any other active server processes
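
To watch for this crash automatically between rebuilds, the first log entry
can be matched with a regular expression. The following is only a sketch:
the pattern is written against the single message shown above (with the
mail client's line wrap undone), and the names CRASH_RE and parse_crash are
hypothetical. Other log_line_prefix settings may need a different pattern.

```python
import re

# Pattern for the postmaster message seen in the log excerpt above.
CRASH_RE = re.compile(
    r"LOG:\s+(?P<process>.+?) \(PID (?P<pid>\d+)\) was terminated by signal "
    r"(?P<signal>\d+): (?P<description>.+)"
)


def parse_crash(line: str):
    """Return a dict describing the crashed process, or None if no match."""
    match = CRASH_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    info["signal"] = int(info["signal"])
    return info


# The first log entry from this report, joined back into one line.
SAMPLE = (
    "2016-06-03 05:55:27 UTC [27303-4] LOG: startup process (PID 27305) was "
    "terminated by signal 11: Segmentation fault"
)
```

parse_crash(SAMPLE) yields the process name, PID, signal number and signal
description as separate fields; lines such as the "terminating any other
active server processes" entry return None.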

If we attach gdb to the process before the segmentation fault occurs, we get:

# gdb -p 30524
GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 30524
Reading symbols from /usr/lib/postgresql/9.5/bin/postgres...Reading symbols
from
/usr/lib/debug/.build-id/c6/7444cae2dbc6bcac46e8052921c01c06780d72.debug...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libxml2.so.2...Reading
symbols from
/usr/lib/debug/.build-id/a1/55c7bc345d0e0b711be09120204bd88f475f9e.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libpam.so.0...(no debugging
symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libssl.so.1.0.0...Reading symbols
from
/usr/lib/debug/.build-id/82/2754695e4b31ae82937258bdff3d52efa0ba36.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libcrypto.so.1.0.0...Reading
symbols from
/usr/lib/debug/.build-id/b7/5a96c59be1b5b54fbf1a91ed722bec9406288e.debug...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2...(no
debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/librt-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.23.so...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libm-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2...Reading
symbols from
/usr/lib/debug/.build-id/ad/f6f41f223d42193165fa0c55871f02d915fb19.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libc-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libicuuc.so.55...Reading
symbols from
/usr/lib/debug/.build-id/32/3e4878073bb4e0d7b174ae24e383ec5e05d68a.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libz.so.1...Reading symbols from
/usr/lib/debug/.build-id/34/0b7b463f981b8a0fb3451751f881df1b0c2f74.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/liblzma.so.5...(no debugging
symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libaudit.so.1...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5.so.3...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libk5crypto.so.3...(no
debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libcom_err.so.2...Reading symbols
from /usr/lib/debug//lib/x86_64-linux-gnu/libcom_err.so.2.1...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5support.so.0...(no
debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols
from
/usr/lib/debug/.build-id/b7/7847cc9cacbca3b5753d0d25a32e5795afe75b.debug...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/ld-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2...Reading
symbols from
/usr/lib/debug/.build-id/6b/9f4061a1d44813a54da4dbb0088f529d8d78ea.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libresolv.so.2...Reading symbols
from /usr/lib/debug//lib/x86_64-linux-gnu/libresolv-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libsasl2.so.2...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgssapi.so.3...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgnutls.so.30...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libicudata.so.55...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libstdc++.so.6...(no
debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...Reading symbols
from /usr/lib/debug//lib/x86_64-linux-gnu/libgcc_s.so.1...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libkeyutils.so.1...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libheimntlm.so.0...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libkrb5.so.26...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libasn1.so.8...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libhcrypto.so.4...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libroken.so.18...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libp11-kit.so.0...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libidn.so.11...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libtasn1.so.6...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libnettle.so.6...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libhogweed.so.4...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libgmp.so.10...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libwind.so.0...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libheimbase.so.1...(no
debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libhx509.so.5...(no debugging
symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libsqlite3.so.0...Reading
symbols from
/usr/lib/debug/.build-id/d9/782ba023caec26b15d8676e3a5d07b55e121ef.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libcrypt.so.1...Reading symbols
from /usr/lib/debug//lib/x86_64-linux-gnu/libcrypt-2.23.so...done.
done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libffi.so.6...Reading symbols
from
/usr/lib/debug/.build-id/9d/9c958f1f4894afef6aecd90d1c430ea29ac34f.debug...done.
done.
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading
symbols from
/usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.23.so...done.
done.
0x00007f819925de70 in __poll_nocancel () at
../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) set pagination off
(gdb) set logging file /tmp/gdb.log
(gdb) set logging on
Copying output to /tmp/gdb.log
(gdb) handle SIGUSR1 nostop
Signal Stop Print Pass to program Description
SIGUSR1 No Yes Yes User defined signal 1
(gdb) handle SIGUSR1 noprint
Signal Stop Print Pass to program Description
SIGUSR1 No No Yes User defined signal 1
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
_bt_restore_page (page=0x7f816fce2b40 "", from=0x55a0945abb70 "\036", len=<optimized out>) at /build/postgresql-9.5-xp9utH/postgresql-9.5-9.5.3/build/../src/backend/access/nbtree/nbtxlog.c:57
57 /build/postgresql-9.5-xp9utH/postgresql-9.5-9.5.3/build/../src/backend/access/nbtree/nbtxlog.c: No such file or directory.
(gdb) bt
#0 _bt_restore_page (page=0x7f816fce2b40 "", from=0x55a0945abb70 "\036", len=<optimized out>) at /build/postgresql-9.5-xp9utH/postgresql-9.5-9.5.3/build/../src/backend/access/nbtree/nbtxlog.c:57
#1 0x0000000000000000 in ?? ()
(gdb) p from
$1 = 0x55a0945abb70 "\036"
(gdb) p end
$2 = 0x55a0945ac928 "\305cO"
(gdb) p i
$3 = 3324
(gdb) p len
$4 = <optimized out>
(gdb) p &itupdata
$5 = (IndexTupleData *) 0x7ffe83ea84e0
(gdb) p items
$6 = {0x0 <repeats 408 times>}
(gdb) p &items
$7 = (Item (*)[408]) 0x7ffe83ea8820
(gdb) p itemsz
$8 = <optimized out>
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/lib/postgresql/9.5/bin/postgres
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
"root" execution of the PostgreSQL server is not permitted.
The server must be started under an unprivileged user ID to prevent
possible system security compromise. See the documentation for
more information on how to properly start the server.
[Inferior 1 (process 1887) exited with code 01]
(gdb) quit
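
As a reading aid, the numbers printed in the gdb session above can be put
side by side with a little arithmetic. This is a sketch that only restates
what gdb showed; it does not establish the cause of the crash. It assumes
that i is used as an index into items (the source code is not shown in this
report), that pointers are 8 bytes wide on this x86_64 build, and that the
printed value of from is meaningful even though the build is optimized. The
function name summarize is hypothetical.

```python
# Values copied from the gdb session above.
FROM_PTR = 0x55A0945ABB70    # (gdb) p from
END_PTR = 0x55A0945AC928     # (gdb) p end
LOOP_INDEX = 3324            # (gdb) p i
ITEMS_LEN = 408              # array length in the type Item (*)[408]
ITEMS_ADDR = 0x7FFE83EA8820  # (gdb) p &items
POINTER_SIZE = 8             # assumption: 64-bit pointers on x86_64


def summarize() -> dict:
    """Compare the printed pointers and index with the array bounds."""
    array_end = ITEMS_ADDR + ITEMS_LEN * POINTER_SIZE
    slot_addr = ITEMS_ADDR + LOOP_INDEX * POINTER_SIZE
    return {
        # Distance between the two pointers printed by gdb.
        "bytes_between_from_and_end": END_PTR - FROM_PTR,
        # True when the index is outside the declared array.
        "index_beyond_array": LOOP_INDEX >= ITEMS_LEN,
        # How far past the end of items the slot items[i] would lie.
        "bytes_past_array_end": slot_addr - array_end,
    }
```

Under those assumptions, end lies 3512 bytes after from, while i is far
larger than the 408 slots of items, so a write to items[i] would land
23328 bytes past the end of the array on the stack. That would be
consistent with the backtrace ending at frame #1 with address 0x0, although
the report by itself does not prove it.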
