BUG #8701: recover process hang on slave

From:	amutu(at)amutu(dot)com
To:	pgsql-bugs(at)postgresql(dot)org
Subject:	BUG #8701: recover process hang on slave
Date:	2013-12-26 02:47:47
Message-ID:	E1Vw0zD-00042M-PU@wrigleys.postgresql.org
Lists:	pgsql-bugs

The following bug has been logged on the website:

Bug reference: 8701
Logged by: amutu
Email address: amutu(at)amutu(dot)com
PostgreSQL version: 9.1.9
Operating system: CentOS 6 x86-64
Description:

We have a master and two streaming slaves. We found that the
replay_location of one slave is far behind the other.

For both slaves, sent_location is BF1/921F6000, and write_location and
flush_location are similar. But replay_location is BF1/9210DD10 on one
server and 6DE/D958E8 on the other.

On the abnormal server, top shows a postgres process replaying the
00000001000006DE00000000 WAL file, and that process takes up 100% of a CPU
core.
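
As a cross-check of the numbers above, here is a minimal Python sketch that maps a replay location to the WAL segment file that contains it and estimates how far apart two locations are. The helper names (parse_lsn, wal_file_name, approx_lag_bytes) are hypothetical and only for illustration. It assumes timeline 1 and the default 16 MB WAL segment size. The lag estimate treats a location as a flat 64-bit value; on 9.1, where, to my understanding, each log id holds slightly fewer usable bytes, that is only an approximation.

```python
# Assumption: default WAL segment size of 16 MB (not stated in the report).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn):
    """Split a location such as '6DE/D958E8' into (log id, byte offset)."""
    hi, lo = lsn.split("/")
    return int(hi, 16), int(lo, 16)


def wal_file_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Name of the WAL segment file containing the given location."""
    log_id, offset = parse_lsn(lsn)
    return "%08X%08X%08X" % (timeline, log_id, offset // segment_size)


def approx_lag_bytes(ahead, behind):
    """Approximate distance in bytes, treating locations as 64-bit values."""
    a_hi, a_lo = parse_lsn(ahead)
    b_hi, b_lo = parse_lsn(behind)
    return ((a_hi << 32) | a_lo) - ((b_hi << 32) | b_lo)


if __name__ == "__main__":
    # Values taken from the report; timeline 1 is read off the WAL file name.
    print(wal_file_name(1, "6DE/D958E8"))
    print(approx_lag_bytes("BF1/9210DD10", "6DE/D958E8"))
```

Under these assumptions, the lagging replay_location 6DE/D958E8 falls in the same segment that the startup process is shown recovering, which is consistent with replay being stuck at that point.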

I tried to restart the slave, but it failed.
I got a core dump of the process. It shows:

Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `postgres: startup process recovering
00000001000006DE00000000'.
#0 0x00000000006264e8 in smgrclose ()
Missing separate debuginfos, use: debuginfo-install
glibc-2.12-1.49.tl1.x86_64
(gdb) bt
#0 0x00000000006264e8 in smgrclose ()
#1 0x00000000006265c8 in smgrcloseall ()
#2 0x0000000000495322 in XLogDropDatabase ()
#3 0x0000000000516253 in dbase_redo ()
#4 0x0000000000492d40 in StartupXLOG ()
#5 0x0000000000495148 in StartupProcessMain ()
#6 0x00000000004ac26f in AuxiliaryProcessMain ()
#7 0x00000000005eb383 in StartChildProcess ()
#8 0x00000000005ef3dc in PostmasterMain ()
#9 0x0000000000590fe8 in main ()

Responses

Re: BUG #8701: recover process hang on slave at 2013-12-26 19:30:01 from Alvaro Herrera
Re: BUG #8701: recover process hang on slave at 2013-12-26 21:47:00 from Sergey Konoplev
