Re: Memory leak in pg_logical_slot_{get,peek}_changes

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Memory leak in pg_logical_slot_{get,peek}_changes
Date: 2024-12-12 15:02:15
Message-ID: CALDaNm1kAHWeBEVKBqp=EnCwOS3qB_OjcKF8zz5J8EpXUcAi+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 11 Dec 2024 at 11:39, Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, December 11, 2024 12:28 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > Hi,
> >
> > I'm starting a new thread for one of the issues reported by Sawada-san at [1].
> >
> > This is a memory leak on CacheMemoryContext when using pgoutput via SQL
> > APIs:
> > /* Map must live as long as the session does. */
> > oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> >
> > entry->attrmap = build_attrmap_by_name_if_req(indesc, outdesc, false);
> >
> > MemoryContextSwitchTo(oldctx);
> > RelationClose(ancestor);
> >
> > entry->attrmap is pfree'd only when validating the RelationSyncEntry
> > so remains even after logical decoding API calls.
> >
> > This issue can be reproduced with the following test:
> > -- Create table
> > create table t1( c1 int, c2 int, c3 int) PARTITION BY LIST (c1);
> > create table t1_1( c2 int, c1 int, c3 int);
> > ALTER TABLE t1 ATTACH PARTITION t1_1 FOR VALUES IN (0, 1, 2, 3);
> >
> > -- Create publication
> > create publication pub1 FOR TABLE t1, t1_1 WITH
> > (publish_via_partition_root = true);
> >
> > -- Create replication slot
> > SELECT 'init' FROM pg_create_logical_replication_slot('test', 'pgoutput');
> >
> > -- Run the below steps between Test start and Test end many times to
> > see the memory usage getting increased
> > -- Test start
> > insert into t1 values( 1);
> > SELECT count(*) FROM pg_logical_slot_get_binary_changes('test', NULL,
> > NULL, 'proto_version', '4', 'publication_names', 'pub1');
> >
> > -- Memory increases after each invalidation and insert
> > SELECT * FROM pg_backend_memory_contexts where name =
> > 'CacheMemoryContext';
> > -- Test end
> >
> > The attached patch resolves a memory leak by ensuring that the
> > attribute map is properly freed during plugin shutdown. This process
> > is triggered by the SQL API when the decoding context is being
> > released. Additionally, I am conducting a review to identify and
> > address any similar memory leaks that may exist elsewhere in the code.
>
> Thanks for reporting the issue and share the fix.
>
> I am not sure if freeing them in shutdown callback is safe, because shutdown
> callback Is not invoked in case of ERRORs. I think we'd better allocate them
> under cachectx in the beginning to avoid freeing them explicitly.
>
> The attachment is a POC that I internally experimented before. It replaces not
> only the attrmap's context, but also others which were allocated under
> CacheMemoryContext, to be consistent. Note that I have not tested it deeply.
> Feel free to use if it works for you.
>
> But the patch can only be used since PG15 where cachectx is introduced. In
> older braches, we may need to think some other ways.

Since we cannot add a new member to PGOutputData, how about creating a
new global static memory context specifically for storing the relation
entry attribute map? This memory context could be allocated and reset
during pgoutput_startup. This way, if a SQL call is made after an
error, the context will be reset, preventing memory bloat as addressed
in the attached patch. Thoughts?

Regards,
Vignesh

Attachment Content-Type Size
PG14_-0001-Fix-memory-leak-in-pgoutput-relation-attribut.patch text/x-patch 2.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Christensen 2024-12-12 15:15:55 Re: [PATCHES] Post-special page storage TDE support
Previous Message vignesh C 2024-12-12 14:54:32 Re: Memory leak in pg_logical_slot_{get,peek}_changes