Re: heavily contended lwlocks with long wait queues scale badly

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: heavily contended lwlocks with long wait queues scale badly
Date: 2022-11-01 07:16:51
Message-ID: CALj2ACUb+2uxW7MP3SBzt9VL0QHsog7t5H6WtwH+e4h5JTJ1RA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 1, 2022 at 5:21 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2022-10-31 16:21:06 +0530, Bharath Rupireddy wrote:
> > BTW, I've seen a sporadic crash (SEGV) with the patch in bg writer
> > with the same set up [1], I'm not sure if it's really because of the
> > patch. I'm unable to reproduce it now and unfortunately I didn't
> > capture further details when it occurred.
>
> That's likely because the prototype patch I submitted in this thread missed
> updating LWLockUpdateVar().
>
> Updated patch attached.

Thanks. It looks good to me. However, some minor comments on the v3 patch:

1.
- if (MyProc->lwWaiting)
+ if (MyProc->lwWaiting != LW_WS_NOT_WAITING)
elog(PANIC, "queueing for lock while waiting on another one");

Can the above condition be MyProc->lwWaiting == LW_WS_WAITING ||
MyProc->lwWaiting == LW_WS_PENDING_WAKEUP for better readability?

Or add an assertion Assert(MyProc->lwWaiting != LW_WS_WAITING &&
MyProc->lwWaiting != LW_WS_PENDING_WAKEUP); before setting
LW_WS_WAITING?

2.
/* Awaken any waiters I removed from the queue. */
proclist_foreach_modify(iter, &wakeup, lwWaitLink)
{

@@ -1044,7 +1052,7 @@ LWLockWakeup(LWLock *lock)
* another lock.
*/
pg_write_barrier();
- waiter->lwWaiting = false;
+ waiter->lwWaiting = LW_WS_NOT_WAITING;
PGSemaphoreUnlock(waiter->sem);
}

/*
* Awaken any waiters I removed from the queue.
*/
proclist_foreach_modify(iter, &wakeup, lwWaitLink)
{
PGPROC *waiter = GetPGProcByNumber(iter.cur);

proclist_delete(&wakeup, iter.cur, lwWaitLink);
/* check comment in LWLockWakeup() about this barrier */
pg_write_barrier();
waiter->lwWaiting = LW_WS_NOT_WAITING;

Can we add an assertion Assert(waiter->lwWaiting ==
LW_WS_PENDING_WAKEUP) in the above two places? We prepare the wakeup
list and set the LW_WS_NOT_WAITING flag in above loops, having an
assertion is better here IMO.

Below are test results with v3 patch. +1 for back-patching it.

HEAD PATCHED
1 34142 34289
2 72760 69720
4 136300 131848
8 210809 210192
16 240718 242744
32 297587 297354
64 341939 343036
128 383615 383801
256 342094 337680
512 263194 288629
768 145526 261553
1024 107267 241811
2048 35716 188389
4096 12415 120300

PG15 PATCHED
1 34503 34078
2 73708 72054
4 139415 133321
8 212396 211390
16 242227 242584
32 303441 309288
64 362680 339211
128 378645 344291
256 340016 344291
512 290044 293337
768 140277 264618
1024 96191 247636
2048 35158 181488
4096 12164 118610

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-11-01 07:22:59 Re: Improve description of XLOG_RUNNING_XACTS
Previous Message Michael Paquier 2022-11-01 07:03:11 Re: ICU for global collation