Re: Logical Replica ReorderBuffer Size Accounting Issues

From: Alex Richman <alexrichman(at)onesignal(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Niels Stevens <niels(dot)stevens(at)onesignal(dot)com>
Subject: Re: Logical Replica ReorderBuffer Size Accounting Issues
Date: 2023-02-16 19:08:14
Message-ID: CAMnUB3o_=q4FYZ5yqCgy=TQUk=ce49RNr7CJD4EGP_=KuYrzFA@mail.gmail.com
Lists: pgsql-bugs

Hi all,

Looping back to say we updated to 15.2 and are still seeing this issue,
though it is less prevalent.

Thanks,
- Alex.

On Wed, 18 Jan 2023 at 11:16, Alex Richman <alexrichman(at)onesignal(dot)com>
wrote:

>
>
> On Wed, 18 Jan 2023 at 10:10, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>> Alex,
>> Do we see this problem with small tuples as well? I see from your
>> earlier email that tuple size is ~800 bytes in the production
>> environment. It is possible that after commit 1b0d9aa4 this kind of
>> problem no longer occurs with small tuple sizes, but that commit happened
>> in PG15 whereas your production environment might be on a prior
>> release.
>>
>
> Hi Amit,
>
> Our prod environment is also on 15.1, which is where we first saw the
> issue, so I'm afraid the issue still seems to be present here.
>
> Looping back on the earlier discussion, we applied the malloc patch from
> [1] ([2]) to a prod server, which also fixes the issue there. Attached is
> a graph of the last 48 hours of memory usage; the ~200GB spikes are
> instances of the LR walsender memory issue.
> After the patch is applied (blue mark), baseline memory drops and we no longer
> see the spikes. Per-process memory stats corroborate that the LR walsender
> memory is now never more than a few MB RSS per process.
>
> Thanks,
> - Alex.
>
> [1]
> https://www.postgresql.org/message-id/CAMnUB3pwknqoe5s-bGuRD8nX1bWkZRbFF%3DjWNLTWbm_etFigkA%40mail.gmail.com
> [2]
> https://gist.github.com/alex-richman-onesignal/4ad147b37eaab99f41a150b51899a564
>
