Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)
Date: 2023-08-16 11:20:23
Message-ID: 3029.1692184823@antos
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> On Tue, Aug 15, 2023 at 2:23 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> > I'm not familiar with the condition variable code enough to have an
> > opinion, but the patch seems to resolve the issue for me - I can no
> > longer reproduce the high CPU usage.
>
> Thanks, pushed.

I'm trying to understand this patch (commit 5ffb7c7750) because I use a
condition variable in an extension. One particular problem occurred to me;
please consider:

ConditionVariableSleep() gets interrupted, so AbortTransaction() calls
ConditionVariableCancelSleep(), but the signal was sent in between. Shouldn't
at least AbortTransaction() and AbortSubTransaction() check the return value
of ConditionVariableCancelSleep(), and re-send the signal if needed?

Note that I have only thought this problem through; I did not try to
reproduce it.
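
To make the scenario concrete, here is a toy, single-threaded model of it in
C. It is only a sketch under stated assumptions: ToyCV and all toy_* functions
are hypothetical names made up for illustration, not PostgreSQL code, and the
model does not show whether the race can actually occur in the server. The one
thing it borrows from the real API is the idea that cancelling a sleep reports
whether a signal had already been delivered to the waiter.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_WAITERS 4

/* Toy wait queue; a simplified stand-in, not PostgreSQL's ConditionVariable. */
typedef struct ToyCV
{
	int		queue[TOY_MAX_WAITERS];		/* waiter ids in FIFO order */
	int		nqueued;					/* number of ids in queue[] */
	bool	prepared[TOY_MAX_WAITERS];	/* waiter has a sleep in progress */
	bool	latch[TOY_MAX_WAITERS];		/* waiter's wakeup flag */
} ToyCV;

/* Join the wait queue, as a waiter does before going to sleep. */
static void
toy_prepare_to_sleep(ToyCV *cv, int id)
{
	assert(id >= 0 && id < TOY_MAX_WAITERS);
	assert(cv->nqueued < TOY_MAX_WAITERS);
	cv->queue[cv->nqueued++] = id;
	cv->prepared[id] = true;
}

/* Wake the oldest waiter: dequeue it and set its latch. */
static bool
toy_signal(ToyCV *cv)
{
	int		id;

	if (cv->nqueued == 0)
		return false;			/* nobody to wake */
	id = cv->queue[0];
	for (int i = 1; i < cv->nqueued; i++)
		cv->queue[i - 1] = cv->queue[i];
	cv->nqueued--;
	cv->latch[id] = true;
	return true;
}

/*
 * Abandon a pending sleep.  Returns true if a signal had already dequeued
 * this waiter, i.e. a wakeup was delivered to it.
 */
static bool
toy_cancel_sleep(ToyCV *cv, int id)
{
	bool	signaled = true;

	if (!cv->prepared[id])
		return false;			/* no sleep was pending */
	for (int i = 0; i < cv->nqueued; i++)
	{
		if (cv->queue[i] == id)
		{
			/* Still queued: remove ourselves, no signal was consumed. */
			for (int j = i + 1; j < cv->nqueued; j++)
				cv->queue[j - 1] = cv->queue[j];
			cv->nqueued--;
			signaled = false;
			break;
		}
	}
	cv->prepared[id] = false;
	return signaled;
}

/* Hypothetical abort path: optionally pass a consumed signal on. */
static void
toy_abort(ToyCV *cv, int id, bool resend)
{
	if (toy_cancel_sleep(cv, id) && resend)
		toy_signal(cv);
}

/* Runs the scenario; returns true if waiter 1 ends up woken. */
bool
toy_second_waiter_woken(bool resend)
{
	ToyCV	cv = {0};

	toy_prepare_to_sleep(&cv, 0);
	toy_prepare_to_sleep(&cv, 1);
	toy_signal(&cv);			/* the signal goes to waiter 0 ... */
	toy_abort(&cv, 0, resend);	/* ... which aborts without acting on it */
	return cv.latch[1];
}
```

In this model, toy_second_waiter_woken(false) leaves waiter 1 asleep although
a signal was sent, while toy_second_waiter_woken(true) passes the signal on.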

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
