hung postmaster?

From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: hung postmaster?
Date: 2005-02-16 05:51:59
Message-ID: 200502152251.59613.pgsql@bluepolka.net
Lists: pgsql-general


I'm seeing some unpleasant database cluster seizures. After
running fine for hours, days, even weeks, new connections via
psql, DBI, and libpq suddenly hang completely, with no log
message or error, while existing connections continue to
execute queries, write log messages, etc. The postmaster is
totally unresponsive to SIGTERM, SIGINT, and SIGQUIT (SIGKILL is
the only thing that shuts it down). Memory, CPU, and available
connections are plentiful. Top, ps, glance, ls, netstat all are
very responsive. I/O load has been pretty high, averaging
700-1100 physical IOs/second. The first time this occurred,
last Friday, it was a mix of our 7.3.4 and 7.4.6 clusters.
Yesterday, it was all of the 7.4.6 64-bit clusters, none of the
32-bit 7.3.4 clusters. Today it was two of the 7.4.6 clusters
and no others.
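
To tell a hung postmaster apart from one that is simply down, something like the following probe might help. It is only a sketch, not a tested tool: it assumes the cluster listens on TCP (psql and libpq often default to a Unix-domain socket instead), and it assumes the server answers an SSLRequest packet with a single byte. The function name `probe_postmaster` and the return strings are made up for illustration. The kernel can complete the TCP handshake even when the postmaster never accepts the connection, so the probe waits for an actual reply rather than just a successful connect.

```python
import socket
import struct

# SSLRequest packet: int32 length (8) followed by int32 request code 80877103.
SSL_REQUEST = struct.pack("!ii", 8, 80877103)


def probe_postmaster(host, port, timeout=5.0):
    """Classify a server as 'responsive', 'hung', 'refused', or 'closed'.

    'hung' means no reply arrived within `timeout` seconds, which matches
    the symptom of new connections blocking forever without an error.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(SSL_REQUEST)
            reply = sock.recv(1)  # any byte back means something answered
    except socket.timeout:
        return "hung"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        return "error: %s" % exc
    return "responsive" if reply else "closed"


if __name__ == "__main__":
    # Hypothetical host and port; substitute each cluster's own settings.
    print(probe_postmaster("127.0.0.1", 5432, timeout=5.0))
```

Run from cron against each cluster's port, this would at least timestamp when a given postmaster stops answering.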

I'm trying to get gdb installed so I can attach to postmaster and
get a backtrace. Other troubleshooting ideas appreciated.
Details below...
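
Once gdb is available, the backtrace capture could be scripted roughly as below so it can be fired the moment a hang is detected. This is a sketch under assumptions: it presumes a gdb build that supports the `-batch`, `-ex`, and `-p` options (older builds, or the HP-UX port, may not), and the helper names `gdb_backtrace_command` and `capture_backtrace` are hypothetical.

```python
import subprocess


def gdb_backtrace_command(pid, gdb="gdb"):
    """Build a gdb command line that attaches to `pid`, prints a backtrace, and exits."""
    return [gdb, "-batch", "-ex", "bt", "-p", str(pid)]


def capture_backtrace(pid, timeout=60):
    """Run gdb against the given process and return its combined output as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,  # avoid blocking forever if gdb itself stalls
    )
    return result.stdout + result.stderr
```

The postmaster PID would come from the first line of `postmaster.pid` in the cluster's data directory; attaching normally requires running as the same user or as root.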

CONFIG:
========
Hardware: One HP 64-bit Itanium rx4640, 16 GB RAM, 4 CPUs, SAN
(Cisco switches, HP EVA-5000 disk array and FC HBAs).
OS: HP-UX B.11.23
Pgsql: 9 clusters installed and running concurrently. Four of
the nine are 32-bit PostgreSQL 7.3.4 on ia64-hp-hpux11.22,
compiled by cc -Ae. The other five are 64-bit PostgreSQL 7.4.6
on ia64-hp-hpux11.23, compiled by gcc 3.3.2.

(We have 2 other identical boxes running without incident with
similar cluster mixes of 7.3.4, 7.3.7, 7.4.6.)
