Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)

From: Alexander Farber <alexander(dot)farber(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
Date: 2010-10-04 09:20:34
Message-ID: AANLkTinofU6RTAqf=sSQX8ZSPuqm-+m0zgUN4Bx9-=FU@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello Postgres users,

I have a Linux website with phpBB serving a small Facebook game:

# uname -a
Linux XXXXX 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010
x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/redhat-release
CentOS release 5.5 (Final)
# rpm -qa | grep -i postgres
postgresql-devel-8.1.21-1.el5_5.1
postgresql-8.1.21-1.el5_5.1
postgresql-server-8.1.21-1.el5_5.1
postgresql-devel-8.1.21-1.el5_5.1
postgresql-libs-8.1.21-1.el5_5.1
postgresql-libs-8.1.21-1.el5_5.1
postgresql-docs-8.1.21-1.el5_5.1

It works generally ok and with little load since June.
(My game has less than 100 users and I have AMD/Quad+4GB)

On Friday I've installed a cronjob (full source code at the bottom):

3 3 * * * psql -a -f
$HOME/bin/clean-phpbb-forum.sql

and on Monday I've found out, that the site is overloaded:

top - 10:16:56 up 3 days, 23:56, 1 user, load average: 20.55, 21.38, 22.92
Tasks: 157 total, 24 running, 133 sleeping, 0 stopped, 0 zombie
Cpu(s): 98.9%us, 1.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4019028k total, 3073968k used, 945060k free, 50604k buffers
Swap: 2104496k total, 132k used, 2104364k free, 2316196k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28974 postgres 16 0 122m 14m 10m R 73.3 0.4 20:56.06 postmaster
28727 postgres 16 0 121m 13m 10m R 54.7 0.3 37:58.58 postmaster
28714 postgres 16 0 122m 14m 10m R 50.4 0.4 38:38.98 postmaster
29412 postgres 16 0 121m 13m 10m R 46.5 0.4 8:03.96 postmaster
28542 postgres 16 0 122m 14m 10m R 46.2 0.4 58:49.38 postmaster
28482 postgres 16 0 122m 14m 10m R 45.8 0.4 61:53.37 postmaster
28468 postgres 16 0 122m 14m 10m R 44.2 0.4 62:46.17 postmaster
29017 postgres 16 0 122m 14m 10m R 43.9 0.4 19:17.06 postmaster
28929 postgres 15 0 122m 14m 10m R 42.2 0.4 22:01.43 postmaster
28500 postgres 16 0 122m 14m 10m R 41.3 0.4 59:40.23 postmaster
28460 postgres 16 0 122m 14m 10m R 40.6 0.4 64:17.16 postmaster
28894 postgres 16 0 122m 14m 10m R 38.6 0.4 23:35.53 postmaster
28489 postgres 16 0 122m 14m 10m R 36.0 0.4 60:32.59 postmaster
28719 postgres 15 0 121m 13m 10m R 25.2 0.3 38:10.33 postmaster
29496 postgres 16 0 121m 13m 10m R 22.9 0.4 4:20.32 postmaster
28556 postgres 15 0 122m 14m 10m R 17.7 0.4 57:32.62 postmaster
28735 postgres 15 0 122m 14m 10m R 15.7 0.4 36:09.45 postmaster
29602 postgres 15 0 119m 11m 9680 S 8.2 0.3 0:00.25 postmaster
28457 postgres 17 0 122m 14m 9m R 3.6 0.4 64:34.38 postmaster
26092 apache 15 0 238m 16m 3740 S 0.3 0.4 0:03.38 httpd
29596 afarber 15 0 12744 1116 800 R 0.3 0.0 0:00.09 top
1 root 15 0 10352 700 592 S 0.0 0.0 0:01.69 init

I understand, that I probably supply not enough info,
but how would you approach debugging this problem?

I've run my cronjob numerous times from CLI - it works ok
and takes only few seconds. I've installed it as an every-3 mins
cronjob on my development VM - it works ok there too.

My biggest problem is, that I don't see any information
from Postgres at the production site - why did it have to start
so many postmaster's (whatever those processes do).

The only log file I've found has been /var/lib/pgsql/pgstartup.log
and there is nothing suspicious there. (Also nothing related to Postgres
in messages, mcelog (empty), audit.log, access_log, error_log).

Please give me some hints
Alex

# cat bin/clean-phpbb-forum.sql
start transaction;

delete from phpbb_users
where user_type=1 and user_inactive_reason=1 and user_lastvisit=0 and
age(to_timestamp(user_regdate))>interval '3 days';

create temp table old_topics (topic_id integer) on commit delete rows;
create temp table old_posts (post_id integer) on commit delete rows;

insert into old_topics select topic_id from phpbb_topics
where forum_id=5 and topic_poster=1 and
age(to_timestamp(topic_time))>interval '7 days';

-- select count(topic_id) as "old topics:" from old_topics;

insert into old_posts select p.post_id from phpbb_posts p, old_topics t
where p.forum_id=5 and p.poster_id=1 and p.topic_id=t.topic_id;

-- select count(post_id) as "old posts:" from old_posts;

delete from phpbb_posts where post_id in (select post_id from old_posts);
delete from phpbb_topics where topic_id in (select topic_id from old_topics);

update phpbb_config set
config_value = (select count(topic_id) from phpbb_topics)
where config_name = 'num_topics';

update phpbb_config set
config_value = (select count(post_id) from phpbb_posts)
where config_name = 'num_posts';

update phpbb_users set
user_posts = (select count(post_id) from phpbb_posts where poster_id=1)
where user_id = 1;

update phpbb_forums set
forum_posts = (select count(post_id) from phpbb_posts),
forum_topics = (select count(topic_id) from phpbb_topics),
forum_topics_real = (select count(topic_id) from phpbb_topics)
where forum_id = 5;

commit

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2010-10-04 09:25:51 Re: Any advice on debugging hanging postgresql-8.1.21 (many postmaster's)
Previous Message Craig Ringer 2010-10-04 07:31:27 Re: Win2k? Really?