Segmentation fault with core dump

From: Joshua Berry <yoberi(at)gmail(dot)com>
To: PostgreSQL - General <pgsql-general(at)postgresql(dot)org>
Subject: Segmentation fault with core dump
Date: 2013-04-10 21:34:40
Message-ID: CAPmZXM03MEDEn6nqqf_Phs3M1DK-EaXP5_K-LmirneOJMAQ-Hg@mail.gmail.com
Lists: pgsql-general

Hi Group,

I'm using PG 9.1.9 with a client application using various versions of the
psqlODBC driver on Windows. Cursors are used heavily, as are some fairly
heavy trigger queries on db writes, which update several materialized
views.

The server has 48GB RAM installed, PG is configured for 12GB shared
buffers, 8MB max_stack_depth, 32MB temp_buffers, and 2MB work_mem. Most of
the other settings are defaults.

The server seg faults at intervals ranging from a few days to two weeks.
Each time one of the postgres server processes seg faults, the server gets
terminated by signal 11 and restarts in recovery for up to 30 seconds, after
which it accepts connections as if nothing ever happened. Unfortunately all
the open cursors and connections are lost, so the client apps are left in a
bad state.

Seg faults have also occurred with PG 8.4. However, that server's DELL OMSA
(hardware health monitoring system) began to report RAM parity errors, so I
assumed the seg faults were due to hardware issues and did not configure the
system to save core files for debugging. I migrated the database to a server
running PG 9.1 in the hope that the problem would disappear, but it has not.
So now I'm starting to debug.

Below are the relevant details. I'm not terribly savvy with gdb, so please
let me know what else I could/should examine from the core dump, as well as
anything else about the system/configuration.

Kind Regards,
-Joshua

#NB: some info in square brackets has been [redacted]
# grep postmaster /var/log/messages
Apr 10 13:18:32 [hostname] kernel: postmaster[17356]: segfault at 40 ip 0000000000710e2e sp 00007fffd193ca70 error 4 in postgres[400000+4ea000]
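
As a rough sanity check on the kernel line above, the numbers can be compared with a little shell arithmetic. This is only a sketch: the variable names are made up for illustration, and the values are copied from the log line. It checks that the instruction pointer falls inside the postgres mapping. A fault address as small as 0x40 would also be consistent with reading a struct member through a NULL pointer, which fits the owner=0x0 that gdb reports.

```shell
# Values copied from the kernel segfault line; variable names are illustrative.
fault_addr=0x40      # address the process tried to access
ip=0x710e2e          # instruction pointer at the time of the fault
map_base=0x400000    # start of the postgres mapping
map_size=0x4ea000    # size of the postgres mapping

# End of the mapping (exclusive).
map_end=$(( map_base + map_size ))

# Is the faulting instruction inside the postgres binary itself?
if [ $(( ip )) -ge $(( map_base )) ] && [ $(( ip )) -lt "$map_end" ]; then
    in_mapping=yes
else
    in_mapping=no
fi

# Offset of the faulting instruction from the start of the mapping.
ip_offset=$(printf '0x%x' $(( ip - map_base )))

echo "fault address:              $fault_addr"
echo "ip inside postgres mapping: $in_mapping"
echo "ip offset within mapping:   $ip_offset"
```

The offset only matters if the binary is relocated at load time; when the mapping starts at a fixed base, gdb can resolve the ip value directly, as it does in the session below.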

gdb /usr/pgsql-9.1/bin/postmaster -c core.17356
[...loading/reading symbols...]
Core was generated by `postgres: [username] [databasename] [client_ipaddress](1500) SELECT '.
Program terminated with signal 11, Segmentation fault.
#0 ResourceOwnerEnlargeCatCacheRefs (owner=0x0) at resowner.c:605
605 if (owner->ncatrefs < owner->maxcatrefs)
(gdb) q
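
To pull more out of the core than the top frame, something like the following might help. It is a sketch, not a tested recipe for this system: dump_backtrace is a hypothetical helper name, and it assumes gdb is installed and that the core file was produced by the given binary. The -batch and -ex options make gdb run the listed commands non-interactively, so the full backtrace can be saved to a file and posted.

```shell
# Hypothetical helper: print a full backtrace from a core file without
# entering the interactive gdb prompt.
#   $1 = path to the executable that produced the core
#   $2 = path to the core file
dump_backtrace() {
    binary=$1
    core=$2

    # Bail out early if gdb is not available on this machine.
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found" >&2
        return 127
    fi

    # Bail out if the core file is missing or unreadable.
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi

    # "bt full" prints every frame with its local variables;
    # the other two commands add register and shared library details.
    gdb -batch \
        -ex "bt full" \
        -ex "info registers" \
        -ex "info sharedlibrary" \
        "$binary" "$core"
}
```

For the crash above it could be called as: dump_backtrace /usr/pgsql-9.1/bin/postmaster core.17356 > backtrace.txt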

# uname -a
Linux [hostname] 2.6.32-358.2.1.el6.x86_64 #1 SMP Tue Mar 12 14:18:09 CDT 2013 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/redhat-release
Scientific Linux release 6.3 (Carbon)

# psql -U jberry
psql (9.1.9)
Type "help" for help.

jberry=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit
(1 row)
