Re: Logical replication - ERROR: could not send data to WAL stream: cannot allocate memory for input buffer

From: Aleš Zelený <zeleny(dot)ales(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Logical replication - ERROR: could not send data to WAL stream: cannot allocate memory for input buffer
Date: 2020-06-16 16:29:55
Message-ID: CAODqTUYzC5xgQQBscEcgNe_pH3bCjc9-VaNVJuMedm8+yhxdvw@mail.gmail.com
Lists: pgsql-general

Thanks for the comment.

From what I was able to monitor, memory usage was almost stable, with
about 20GB allocated as cached memory. Memory overcommit is disabled
on the database server. Might it still be a memory issue, since it was
synchronizing newly added tables with a total of 380 GB of data containing
JSONB columns (60 bytes to 100 kB each)? The problem is that I was not able
to reproduce it: in the dev environment it works like a charm, and, as
usual, on PROD we were facing this issue.
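
One way to look at this more closely, offered only as a sketch: if "overcommit
disabled" means vm.overcommit_memory=2 (an assumption, the exact setting is not
stated here), the kernel refuses new allocations once Committed_AS reaches
CommitLimit, no matter how much memory is sitting in the page cache. Sampling
both fields from /proc/meminfo more often than every 10 seconds could show
whether the commit headroom briefly drops near zero during the initial table
sync. The function names and the 1-second interval below are illustrative.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; the unit suffix is ignored.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def commit_headroom_kb(info):
    """Memory (kB) that can still be committed before allocations fail
    under strict overcommit accounting (assumed vm.overcommit_memory=2)."""
    return info["CommitLimit"] - info["Committed_AS"]


def sample(path="/proc/meminfo", interval=1.0, count=60):
    """Print commit headroom and cached memory `count` times."""
    for _ in range(count):
        with open(path) as f:
            info = parse_meminfo(f.read())
        print(
            time.strftime("%H:%M:%S"),
            "headroom_kB=%d" % commit_headroom_kb(info),
            "cached_kB=%d" % info.get("Cached", 0),
        )
        time.sleep(interval)
```

Calling sample() on the subscriber while the tables are being copied would log
one line per second. A small headroom next to a large cached value would be
consistent with realloc() failing even though the machine looks far from out
of memory.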

It is clear that a test case would be appropriate for memory allocation
issues, but I was not able to build a reproducible one.

Thanks, Ales

On Mon, Jun 8, 2020 at 8:41, Michael Paquier <michael(at)paquier(dot)xyz>
wrote:

> On Fri, Jun 05, 2020 at 10:57:46PM +0200, Aleš Zelený wrote:
> > we are using logical replication for more than 2 years and today I've found
> > new not yet know error message from wal receiver. The replication was in
> > catchup mode (on publisher side some new tables were created and added to
> > publication, on subscriber side they were missing).
>
> This comes from pqCheckInBufferSpace() in libpq when realloc() fails,
> most probably because this host ran out of memory.
>
> > Repeated several times, finally it proceeded and switch into streaming
> > state. The OS has 64GB RAM, OS + database instance are using usually 20GB
> > rest is used as OS buffers. I've checked monitoring (sampled every 10
> > seconds) and no memory usage peak was visible, so unless it was a very
> > short memory usage peak, I'd not expect the system running out of memory.
> >
> > Is there something I can do to diagnose and avoid this issue?
>
> Does the memory usage increase slowly over time? Perhaps it was not a
> peak and the memory usage was not steady? One thing that could always
> be tried if you are able to get a rather reproducible case would be to
> use valgrind and check if it is able to detect any leaks. And I am
> afraid that it is hard to act on this report without more information.
> --
> Michael
>
