From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: SR standby hangs |
Date: | 2011-04-26 20:28:51 |
Message-ID: | 21470.1303849731@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> This has happened again. This time we have some debug info available,
> and can possibly get more, if people tell me what will be helpful:
> (gdb) f 2
> #2 0x00000000005de735 in LockBufferForCleanup (buffer=310163) at bufmgr.c:2432
> 2432 ProcWaitForSignal();
> (gdb) p *bufHdr
> $2 = {tag = {rnode = {spcNode = 16393, dbNode = 40475, relNode =
> 41880}, forkNum = MAIN_FORKNUM, blockNum = 18913}, flags = 6,
> usage_count = 1, refcount = 1, wait_backend_pid = 9111,
> buf_hdr_lock = 0 '\000', buf_id = 310162, freeNext = -2,
> io_in_progress_lock = 620448, content_lock = 620449}
Well, that's pretty interesting: refcount is only 1, and the
BM_PIN_COUNT_WAITER flag is not set. AFAICS this *must* mean that the
buffer had been pinned and whoever had it (presumably bgwriter) did
UnpinBuffer(). So it appears that the signal just plain got lost :-(,
which suggests a kernel bug. What platform is this on, again?
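For anyone checking the reading of flags = 6 by hand, here is a minimal sketch of the decoding. It assumes the flag bit assignments from src/include/storage/buf_internals.h in the 9.0-era sources; those values are not quoted anywhere in this thread, so treat them as an assumption and verify against the tree actually in use. The helpers has_flag and decode_flags are hypothetical names for illustration, not PostgreSQL functions.

```c
#include <stddef.h>
#include <string.h>

/* Assumed flag bits, as in 9.0-era buf_internals.h (not quoted in this thread). */
#define BM_DIRTY             (1 << 0)
#define BM_VALID             (1 << 1)
#define BM_TAG_VALID         (1 << 2)
#define BM_IO_IN_PROGRESS    (1 << 3)
#define BM_IO_ERROR          (1 << 4)
#define BM_JUST_DIRTIED      (1 << 5)
#define BM_PIN_COUNT_WAITER  (1 << 6)
#define BM_CHECKPOINT_NEEDED (1 << 7)

/* Hypothetical helper: nonzero if the given flag bit is set in flags. */
static int
has_flag(unsigned flags, unsigned bit)
{
    return (flags & bit) != 0;
}

/*
 * Hypothetical helper: write the names of all set flags into out,
 * separated by '|'. outlen must be at least 1.
 */
static void
decode_flags(unsigned flags, char *out, size_t outlen)
{
    static const struct { unsigned bit; const char *name; } names[] = {
        { BM_DIRTY,             "BM_DIRTY" },
        { BM_VALID,             "BM_VALID" },
        { BM_TAG_VALID,         "BM_TAG_VALID" },
        { BM_IO_IN_PROGRESS,    "BM_IO_IN_PROGRESS" },
        { BM_IO_ERROR,          "BM_IO_ERROR" },
        { BM_JUST_DIRTIED,      "BM_JUST_DIRTIED" },
        { BM_PIN_COUNT_WAITER,  "BM_PIN_COUNT_WAITER" },
        { BM_CHECKPOINT_NEEDED, "BM_CHECKPOINT_NEEDED" },
    };
    size_t i;

    out[0] = '\0';
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
    {
        if (flags & names[i].bit)
        {
            if (out[0] != '\0')
                strncat(out, "|", outlen - strlen(out) - 1);
            strncat(out, names[i].name, outlen - strlen(out) - 1);
        }
    }
}
```

Under those assumed values, calling decode_flags(6, buf, sizeof(buf)) yields BM_VALID|BM_TAG_VALID, and has_flag(6, BM_PIN_COUNT_WAITER) is false, which matches the observation that the waiter flag is not set.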
> Strangely, I don't even see ProcWaitForSignal() in the frame list - I
> shouldn't have thought it was a candidate to be optimised away.
It has a tail call to PGSemaphoreLock, so not totally surprising.
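To illustrate the shape in question, here is a small sketch using hypothetical stand-ins (sem_lock_stub and wait_for_signal_stub are made-up names, not PostgreSQL code). When a function's last action is to call another function and return, an optimizing compiler may turn the call into a jump and reuse the caller's stack frame. The outer function then leaves no frame of its own, and a backtrace goes straight from the callee to the outer function's caller.

```c
/* Hypothetical stand-in for the semaphore lock: decrement and report. */
int
sem_lock_stub(int *sem)
{
    *sem -= 1;
    return *sem;
}

/*
 * Hypothetical stand-in for the wrapper: its only work is the final call,
 * so the compiler is free to emit a jump instead of call-then-return.
 * If it does, this function will not appear in a debugger's frame list.
 */
int
wait_for_signal_stub(int *sem)
{
    return sem_lock_stub(sem);   /* call in tail position */
}
```

Whether the jump is actually emitted depends on the compiler and optimization level; with gcc, sibling-call optimization is normally enabled at -O2, and building with -O0 should bring the missing frame back.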
regards, tom lane