From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: backends stuck in "startup" |
Date: | 2017-11-23 00:43:50 |
Message-ID: | 14525.1511397830@sss.pgh.pa.us |
Lists: | pgsql-general |
Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> For starters, I found that PID 27427 has:
> (gdb) p proc->lwWaiting
> $1 = 0 '\000'
> (gdb) p proc->lwWaitMode
> $2 = 1 '\001'
To confirm, this is LWLockAcquire's "proc", equal to MyProc?
If so, and if LWLockAcquire is blocked at PGSemaphoreLock,
that sure seems like a smoking gun.
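To spell out why that combination would look wrong, here is a toy C model of the
sleep/wakeup handshake. It is only a sketch under stated assumptions: ModelProc,
sem_count and the model_* functions are invented for illustration and are not
PostgreSQL's structures or code. The assumed protocol is that the releaser clears
lwWaiting and then posts the waiter's semaphore, and that the waiter re-checks
lwWaiting every time it wakes up.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of state that matter here. */
typedef struct ModelProc
{
    bool lwWaiting;   /* true while the backend is queued on a lock */
    int  sem_count;   /* models the counter of the backend's semaphore */
} ModelProc;

/* Waiter side: mark ourselves as waiting before going to sleep. */
static void
model_enqueue(ModelProc *p)
{
    p->lwWaiting = true;
}

/*
 * Releaser side (assumed order): clear the flag, then post the semaphore.
 * post_lost = true models a wakeup whose semaphore post never arrives.
 */
static void
model_wakeup(ModelProc *p, bool post_lost)
{
    p->lwWaiting = false;
    if (!post_lost)
        p->sem_count++;
}

/*
 * One pass of the waiter's sleep loop.  Returns true if the waiter leaves
 * the loop, false if it stays (or goes back to being) blocked.
 */
static bool
model_wait_step(ModelProc *p)
{
    if (p->sem_count == 0)
        return false;           /* still blocked in the semaphore wait */
    p->sem_count--;             /* woken up: consume one post */
    return !p->lwWaiting;       /* leave only if the flag was cleared */
}

/*
 * For a backend that is inside the sleep loop: flag already cleared, yet
 * nothing left to wake it.  Nobody will post again, so it sleeps forever.
 */
static bool
model_is_stuck(const ModelProc *p)
{
    return !p->lwWaiting && p->sem_count == 0;
}
```

In this model a normal wakeup leaves the waiter with a pending post, so
model_wait_step() returns true on its next pass. Only a lost post produces the
state that model_is_stuck() detects, which is the state the gdb output would
correspond to if the backend really is sitting in the semaphore wait.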
> Note: I've compiled locally PG 10.1 with PREFERRED_SEMAPHORES=SYSV to keep the
> service up (and to the degree that serves to verify that avoids the issue,
> great).
Good idea, I was going to suggest that. It will be very interesting
to see if that makes the problem go away.
> Would you suggest how I can maximize the likelihood/speed of triggering that?
> Five years ago, with a report of similar symptoms, you said "You need to hack
> pgbench to suppress the single initialization connection it normally likes to
> make, else the test degenerates to the one-incoming-connection case"
> https://www.postgresql.org/message-id/8896.1337998337%40sss.pgh.pa.us
I don't think that case was related at all.
My theory suggests that any contended use of an LWLock is at risk,
in which case just running pgbench with about as many sessions as
you have in the live server ought to be able to trigger it. However,
that doesn't really account for your having observed the problem
only during session startup, so there may be some other factor
involved. I wonder if it only happens during the first wait for
an LWLock ... and if so, how could that be?
regards, tom lane