Re: Using per-transaction memory contexts for storing decoded tuples

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using per-transaction memory contexts for storing decoded tuples
Date: 2024-09-17 09:05:51
Message-ID: CAA4eK1KoehTY85kNCEUAV2F-ubd==WyHhJFuv6ybWGqk_GKGAg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 16, 2024 at 10:43 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Sep 13, 2024 at 3:58 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Sep 12, 2024 at 4:03 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > We have several reports that logical decoding uses memory much more
> > > than logical_decoding_work_mem[1][2][3]. For instance in one of the
> > > reports[1], even though users set logical_decoding_work_mem to
> > > '256MB', a walsender process was killed by OOM because of using more
> > > than 4GB memory.
> > >
> > > As per the discussion in these threads so far, what happened was that
> > > there was huge memory fragmentation in rb->tup_context.
> > > rb->tup_context uses GenerationContext with 8MB memory blocks. We
> > > cannot free memory blocks until all memory chunks in the block are
> > > freed. If there is a long-running transaction making changes, its
> > > changes could be spread across many memory blocks and we end up not
> > > being able to free memory blocks unless the long-running transaction
> > > is evicted or completed. Since we don't account fragmentation, block
> > > header size, nor chunk header size into per-transaction memory usage
> > > (i.e. txn->size), rb->size could be less than
> > > logical_decoding_work_mem but the actual allocated memory in the
> > > context is much higher than logical_decoding_work_mem.
> > >
> >
> > It is not clear to me how the fragmentation happens. Is it because
> > of some interleaving transactions that have already ended but whose
> > memory has not been released?
>
> In a generation context, a memory block can be freed only when all
> memory chunks in it have been freed. Therefore, the individual tuple
> buffers are already pfree()'d, but the underlying memory blocks are
> not freed.
>

I understood this part but not the cases that lead to this problem. For
example, if there is a large (and only) transaction in the system that
allocates many buffers for change records during decoding, then in the
end it should free the memory for all the buffers allocated in the
transaction. So, wouldn't that free all the memory chunks in the
allocated memory blocks? My guess was that we couldn't free all the
chunks because there were small interleaving transactions that allocated
memory but didn't free it before the large transaction ended; a rough
sketch of that pattern is below.

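To make sure we are talking about the same pattern, here is a rough,
untested sketch of what I mean. It would have to run inside a backend
(e.g. from a test extension), and everything apart from the memory
context API calls is made up for illustration:

#include "postgres.h"
#include "nodes/pg_list.h"
#include "utils/memutils.h"

static void
illustrate_tup_context_fragmentation(void)
{
    /* same shape as rb->tup_context: generation context, 8MB blocks */
    MemoryContext tup_context =
        GenerationContextCreate(CurrentMemoryContext, "Tuples",
                                SLAB_LARGE_BLOCK_SIZE,
                                SLAB_LARGE_BLOCK_SIZE,
                                SLAB_LARGE_BLOCK_SIZE);
    List       *small_txn_chunks = NIL;
    List       *large_txn_chunks = NIL;
    ListCell   *lc;

    /*
     * Interleave allocations from a "large" and a "small" transaction,
     * the way decoding interleaved WAL records would.  Chunks from both
     * end up mixed within the same 8MB blocks.
     */
    for (int i = 0; i < 100000; i++)
    {
        large_txn_chunks = lappend(large_txn_chunks,
                                   MemoryContextAlloc(tup_context, 200));
        small_txn_chunks = lappend(small_txn_chunks,
                                   MemoryContextAlloc(tup_context, 200));
    }

    /* the large transaction ends and frees all of its tuple buffers */
    foreach(lc, large_txn_chunks)
        pfree(lfirst(lc));

    /*
     * Even so, hardly any 8MB block can be returned to the OS, because
     * every block still contains live chunks from the small transaction.
     * Per-transaction accounting (txn->size) would show only the small
     * transaction's bytes, while the context still holds all the blocks.
     */
    MemoryContextStats(tup_context);
}
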
> > Can we try reducing the size of
> > 8MB memory blocks? The comment atop allocation says: "XXX the
> > allocation sizes used below pre-date generation context's block
> > growing code. These values should likely be benchmarked and set to
> > more suitable values.", so do we need some tuning here?
>
> Reducing the size of the 8MB memory blocks would be one solution, and
> it could be preferable since it would be back-patchable. It would
> mitigate the problem but not resolve it. I agree with trying to reduce
> it and running some benchmark tests. If that makes the problem
> reasonably less likely to happen, it would be a good solution.
>

Makes sense.

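For reference, IIRC the context in question is created in
ReorderBufferAllocate() roughly as below (quoting from memory, so please
double-check against HEAD). The experiment would mostly mean swapping out
SLAB_LARGE_BLOCK_SIZE (8MB) here, e.g. for SLAB_DEFAULT_BLOCK_SIZE (8kB)
or something in between:

    /*
     * XXX the allocation sizes used below pre-date generation context's
     * block growing code.  These values should likely be benchmarked and
     * set to more suitable values.
     */
    buffer->tup_context = GenerationContextCreate(new_ctx,
                                                  "Tuples",
                                                  SLAB_LARGE_BLOCK_SIZE,
                                                  SLAB_LARGE_BLOCK_SIZE,
                                                  SLAB_LARGE_BLOCK_SIZE);

Since the generation context can now grow its blocks, starting from a
smaller initBlockSize and capping maxBlockSize lower would also bound how
much memory a mostly-empty block can pin, at the cost of more malloc()
calls for large transactions.
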
--
With Regards,
Amit Kapila.
