Re: Memory leak in WAL sender with pgoutput (v10~)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Memory leak in WAL sender with pgoutput (v10~)
Date: 2024-12-10 18:14:24
Message-ID: CAD21AoDbyijAUeKtL1=KNFUuSxqUZs-NAw3pr4tOEycycemf7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 10, 2024 at 1:16 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Dec 10, 2024 at 11:24 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Dec 10, 2024 at 8:54 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > On Tue, 10 Dec 2024 at 04:56, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > > >
> > > > On Mon, Dec 09, 2024 at 03:36:15PM +0530, Amit Kapila wrote:
> > > > > It couldn't solve the problem completely even in back-branches. The
> > > > > SQL API case I mentioned and tested by Hou-San in the email [1] won't
> > > > > be solved.
> > > > >
> > > > > [1] - https://www.postgresql.org/message-id/OS0PR01MB57166A4DA0ABBB94F2FBB28694362%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> > > >
> > > > Yeah, exactly (wanted to reply exactly that yesterday but lacked time,
> > > > thanks!).
> > >
> > > Yes, that makes sense. How about something like the attached patch.
> > >
> >
> > - oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> > - if (data->publications)
> > - {
> > - list_free_deep(data->publications);
> > - data->publications = NIL;
> > - }
> > + static MemoryContext pubctx = NULL;
> > +
> > + if (pubctx == NULL)
> > + pubctx = AllocSetContextCreate(CacheMemoryContext,
> > + "logical replication publication list context",
> > + ALLOCSET_SMALL_SIZES);
> > + else
> > + MemoryContextReset(pubctx);
> > +
> > + oldctx = MemoryContextSwitchTo(pubctx);
> >
> > Considering the SQL API case, why is it okay to allocate this context
> > under CacheMemoryContext?
> >
>
> On further thinking, we can't allocate it under
> LogicalDecodingContext->context because once that is freed at the end
> of SQL API pg_logical_slot_get_changes(), pubctx will be pointing to a
> dangling memory. One idea is that we use
> MemoryContextRegisterResetCallback() to invoke a reset callback
> function where we can reset pubctx but not sure if we want to go there
> in back branches. OTOH, the currently proposed fix won't leak memory
> on repeated calls to pg_logical_slot_get_changes(), so that might be
> okay as well.
>
> Thoughts?

Alternative idea is to declare pubctx as a file static variable. And
we create the memory context under LogicalDecodingContext->context in
the startup callback and free it in the shutdown callback.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-12-10 18:37:30 Re: Assert failure on running a completed portal again
Previous Message Robert Haas 2024-12-10 18:14:13 Re: Assert failure on running a completed portal again