Re: "multiple backends attempting to wait for pincount 1"

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: "multiple backends attempting to wait for pincount 1"
Date: 2015-02-23 15:25:53
Message-ID: 20150223152553.GC30784@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-02-17 13:14:00 -0500, Tom Lane wrote:
> Hm, good point. On the other hand, should we worry about the possibility
> of a lost signal? Moving the flag-clearing would guard against that,
> which the current code does not. But we've not seen field reports of such
> issues AFAIR, so this might not be an important consideration.

I think if there were lost signals there'd be much bigger problems given
the same (or in master) similar mechanics are used for a lot of other
things including heavyweight and lightweight locks wait queues.

> > ...
> > /*
> > * Make sure waiter flag is reset - it might not be if
> > * ProcWaitForSignal() returned for another reason than UnpinBuffer()
> > * signalling us.
> > */
> > LockBufHdr(bufHdr);
> > buf->flags &= ~BM_PIN_COUNT_WAITER;
> > Assert(bufHdr->wait_backend_pid == MyProcPid);
> > UnlockBufHdr(bufHdr);
>
> > PinCountWaitBuf = NULL;
> > /* Loop back and try again */
> > }
>
> > to the bottom of the loop would suffice.
>
> No, I disagree. If we maintain the rule that the signaler clears
> BM_PIN_COUNT_WAITER, then once that happens there is nothing to stop a
> third party from trying to LockBufferForCleanup on the same buffer (except
> for table-level locking conventions, which IMO this mechanism shouldn't be
> dependent on). So this coding would potentially clear the
> BM_PIN_COUNT_WAITER flag belonging to that third party, and then fail the
> Assert --- but only in debug builds, not in production, where it would
> just silently lock up the third-party waiter. So I think having a test to
> verify that it's still "our" BM_PIN_COUNT_WAITER flag is essential.

Pushed with a test guarding against that. I still think it might be
slightly better to error out if somebody else waits, but I guess it's
unlikely that we'd mistakenly add code doing that.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2015-02-23 15:25:57 Primary not sending to synchronous standby
Previous Message Albe Laurenz 2015-02-23 15:01:04 Re: SSL renegotiation