Re: BUG #17974: Walsenders memory usage suddenly spike to 80G+ causing OOM and server reboot

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	mguissine(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #17974: Walsenders memory usage suddenly spike to 80G+ causing OOM and server reboot
Date:	2023-06-14 01:23:32
Message-ID:	ZIkWlJNVSN1hKnYw@paquier.xyz
Lists:	pgsql-bugs

On Wed, Jun 14, 2023 at 12:05:32AM +0000, PG Bug reporting form wrote:
> We are running a relatively large and busy Postgres database on RDS and use
> logical replication extensively. We currently have 7 walsenders, and while we
> often see replication fall behind due to high transactional volume, we
> never experienced memory issues in 14.6 and below. After a recent upgrade to
> 14.8, we have already had several incidents where each walsender process's RES
> memory would suddenly increase to over 80GB, causing freeable memory on the
> instance to drop to zero. Interestingly, even after an instance reboot,
> the memory used by the walsender processes is not released until we restart
> the replication and drop the logical slots. logical_decoding_work_mem
> was set to 512MB at the time of the last incident, but we recently lowered it to
> 128MB.
>
> Any known issues in pg 14.8 that would trigger this behaviour?

Yes, there are known issues with memory handling in logical
replication setups. See for example this thread:
https://www.postgresql.org/message-id/CAMnUB3oYugXCBLSkih+qNsWQPciEwos6g_AMbnz_peNoxfHwyw@mail.gmail.com

This is not a simple problem, unfortunately :/
--
Michael

In response to

BUG #17974: Walsenders memory usage suddenly spike to 80G+ causing OOM and server reboot at 2023-06-14 00:05:32 from PG Bug reporting form

Responses

Re: BUG #17974: Walsenders memory usage suddenly spike to 80G+ causing OOM and server reboot at 2023-06-14 22:15:03 from Andres Freund
