From: | Mark Simonetti <marks(at)opalsoftware(dot)co(dot)uk> |
---|---|
To: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Hang on NOTIFY |
Date: | 2015-08-07 11:32:38 |
Message-ID: | 55C49756.70505@opalsoftware.co.uk |
Lists: | pgsql-bugs |
The system I am developing makes extensive use of the async
NOTIFY/LISTEN system.
I am currently experiencing a problem on 2 production servers:
Server 1:
Virtual Windows Server 2008 R2 (VMWare)
PostgreSQL 9.3.5
Server 2:
Virtual Windows Server 2008 R2 (VMWare)
PostgreSQL 9.4.2
After the system has been running for a period of time, sometimes a few
days, sometimes a few weeks, any call to NOTIFY will hang.
After in-depth investigation, it appears to happen when a listening
backend has been connected for some time (days).
Any other backend trying to signal that backend will hang in
"CallNamedPipe" in pgkill (kill.c).
Here is a stack trace from the hung SENDING backend, main thread : -
ntdll.dll!_NtFsControlFile@40() + 0x15 bytes
ntdll.dll!_NtFsControlFile@40() + 0x15 bytes
kernel32.dll!_CallNamedPipeW@28() + 0xf4 bytes
postgres.exe!pgkill(int pid, int sig) Line 43 + 0x2b bytes C
postgres.exe!SendProcSignal(int pid, ProcSignalReason reason, int backendId) Line 198 + 0x10 bytes C
postgres.exe!SignalBackends() Line 1497 + 0xe bytes C
> postgres.exe!ProcessCompletedNotifies() Line 1092 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 3947 C
postgres.exe!BackendRun(Port * port) Line 4011 + 0x21 bytes C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4515 + 0x8 bytes C
postgres.exe!main(int argc, char * * argv) Line 203 + 0x7 bytes C
postgres.exe!__tmainCRTStartup() Line 555 + 0x17 bytes C
kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
Here is a stack trace from the signalling thread (I know it's irrelevant
as this is for incoming signals) : -
ntdll.dll!_NtFsControlFile@40() + 0x15 bytes
ntdll.dll!_NtFsControlFile@40() + 0x15 bytes
> postgres.exe!pg_signal_thread(void * param) Line 279 + 0x9 bytes C
Now for the RECIPIENT backend : -
ntdll.dll!_ZwWaitForMultipleObjects@20() + 0x15 bytes
ntdll.dll!_ZwWaitForMultipleObjects@20() + 0x15 bytes
KERNELBASE.dll!_WaitForMultipleObjectsEx@20() + 0x36 bytes
kernel32.dll!_WaitForMultipleObjectsExImplementation@20() + 0x8e bytes
> postgres.exe!pgwin32_waitforsinglesocket(unsigned int s, int what, int timeout) Line 216 + 0x14 bytes C
postgres.exe!pgwin32_recv(unsigned int s, char * buf, int len, int f) Line 352 + 0xa bytes C
postgres.exe!secure_read(Port * port, void * ptr, unsigned int len) Line 304 + 0x12 bytes C
postgres.exe!pq_getbyte() Line 895 + 0x67 bytes C
postgres.exe!SocketBackend(StringInfoData * inBuf) Line 344 + 0x5 bytes C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 3968 + 0x1c bytes C
postgres.exe!BackendRun(Port * port) Line 4011 + 0x21 bytes C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4515 + 0x8 bytes C
postgres.exe!main(int argc, char * * argv) Line 203 + 0x7 bytes C
postgres.exe!__tmainCRTStartup() Line 555 + 0x17 bytes C
kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
This is the usual place for it to wait, so this seems okay.
The recipient's signalling thread : -
ntdll.dll!_NtFsControlFile@40() + 0x15 bytes
ntdll.dll!_NtFsControlFile@40() + 0x15 bytes
> postgres.exe!pg_signal_thread(void * param) Line 279 + 0x9 bytes C
Also looks fine.
This seems like a possible Windows bug, as the call to CallNamedPipe has
a timeout of 1000 milliseconds, but it is clearly not timing out. It
only seems to exit if I exit the backend it is trying to signal.
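One thing that may be relevant here, offered as a sketch rather than a diagnosis: as far as the Win32 documentation describes it, the nTimeOut argument of CallNamedPipe only bounds the wait for a pipe instance to become available. Once the connection is made, the write and read that follow are not covered by it. The C fragment below illustrates the shape of such a call. The helper names build_signal_pipe_name and send_signal_byte are hypothetical, and the "pgsignal_<pid>" pipe name format is an assumption about what kill.c uses, not a quote from it.

```c
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Hypothetical helper: format the per-backend signal pipe name.
 * The "pgsignal_<pid>" format is an assumption, not copied from kill.c.
 * Returns 0 on success, -1 if the name does not fit in the buffer.
 */
static int
build_signal_pipe_name(char *buf, size_t buflen, unsigned int pid)
{
    int n = snprintf(buf, buflen, "\\\\.\\pipe\\pgsignal_%u", pid);

    return (n > 0 && (size_t) n < buflen) ? 0 : -1;
}

#ifdef _WIN32
/*
 * Hypothetical sender: write one byte, expect the same byte echoed back.
 * Returns 0 on success, -1 on any failure.
 */
static int
send_signal_byte(unsigned int pid, unsigned char sig)
{
    char  pipename[128];
    BYTE  in = sig;
    BYTE  out = 0;
    DWORD nread = 0;

    if (build_signal_pipe_name(pipename, sizeof(pipename), pid) != 0)
        return -1;

    /*
     * The 1000 ms here limits only the wait for a free pipe instance.
     * After the connection succeeds, the write + read can block until
     * the server side replies or closes its end of the pipe.
     */
    if (!CallNamedPipeA(pipename, &in, 1, &out, 1, &nread, 1000))
        return -1;

    return (nread == 1 && out == sig) ? 0 : -1;
}
#endif
```

If that reading is right, a sender that connects to a pipe whose server thread never answers would wait indefinitely, which would match the behaviour described above where the call only returns once the recipient backend exits.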
NOTE: each sender is trying to signal many backends, but every stuck
backend I checked was stuck sending to the same recipient.
Closing that particular recipient DOES free everything up and signals
start flowing again.
I've searched around and cannot find a similar bug report. Is it
possibly something I'm doing wrong?
Thanks,
Mark.
--