Quick Links

Re: Proposal: Limitations of palloc inside checkpointer

From:	Xuneng Zhou <xunengzhou(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Proposal: Limitations of palloc inside checkpointer
Date:	2025-04-10 15:43:20
Message-ID:	CABPTF7WGU1dNjOn_Ea2Os78zqu90jEUr=oyveOtpG8bNVh87bA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

I’ve updated and rebased Maxim's patch version 4 with optimizations like
ring buffer and capped max_requests proposed by Heikki. These are the
summary of Changes in v5:

• Replaced the linear array model in AbsorbSyncRequests() with a bounded
ring buffer to avoid large memmove() operations when processing sync
requests.

• Introduced a tunable batch size (CKPT_REQ_BATCH_SIZE, default:
10,000) to incrementally
absorb requests while respecting MaxAllocSize.

• Capped max_requests with a hard upper bound (MAX_CHECKPOINT_REQUESTS =
10,000,000) to avoid pathological memory growth when shared_buffers is very
large.

• Updated CompactCheckpointerRequestQueue() to support the ring buffer
layout.

• Retained the existing compacting logic but modified it to operate
in-place over the ring.

> >> The patch itself looks ok to me. I'm curious about the trade-offs
> >> between this incremental approach and the alternative of
> >> using palloc_extended() with the MCXT_ALLOC_HUGE flag. The approach
> >> of splitting the requests into fixed-size slices avoids OOM
> >> failures or process termination by the OOM killer, which is good.
>
> +1
>
> >> However, it does add some overhead with additional lock acquisition/
> >> release cycles and memory movement operations via memmove(). The
> >> natural question is whether the security justify the cost.
>
> Hmm, if you turned the array into a ring buffer, you could absorb part
> of the queue without the memmove().
>
> With that, I'd actually suggest using a much smaller allocation, maybe
> 10000 entries or even less. That should be enough to keep the lock
> acquire/release overhead acceptable
>

> >> Regarding the slice size of 1 GB, is this derived from
> >> MaxAllocSize limit, or was it chosen for other performance
> >> reasons? whether a different size might offer better performance
> >> under typical workloads?
> >
> > I think 1 GB is derived purely from MaxAllocSize. This "palloc" is a
> > relatively old one, and no one expected the number of requests to exceed
> > 1 GB. Now we have the ability to set the shared_buffers to a huge number
> > (without discussing now whether this makes any real sense), thus this
> > limit for palloc becomes a problem.
>
> Yeah. Frankly even 1 GB seems excessive for this. We set max_requests =
> NBuffers, which makes some sense so that you can fit the worst case
> scenario of quickly evicting all pages from the buffer cache. But even
> then I'd expect the checkpointer to be able to absorb the changes before
> the queue fills up. And we have the compaction logic now too.
>
> So I wonder if we should cap max_requests at, say, 10 million requests now?
>
> CompactCheckpointerRequestQueue() makes a pretty large allocation too,
> BTW, although much smaller than the one in AbsorbSyncRequests(). The
> hash table it builds could grow quite large though.
>
> I’m not certain what the optimal cap for max_requests should be—whether
it’s 10 million(setting it to 10 million would result in a peak temporary
allocation of roughly 700 MB within CompactCheckpointerRequestQueue), 1
million, or some other value—but introducing an upper bound seems
important. Without a cap, max_requests could theoretically scale with
NBuffers, potentially resulting in excessive temporary memory allocations.

As this is my first patch submission, it might be somewhat naive and
experimental— I appreciate your patience and feedback.

Attachment	Content-Type	Size
v5-0001-Process-sync-requests-incrementally-in-AbsorbSyncReq.patch	application/octet-stream	9.4 KB

In response to

Re: Proposal: Limitations of palloc inside checkpointer at 2025-04-04 21:22:52 from Heikki Linnakangas

Responses

Re: Proposal: Limitations of palloc inside checkpointer at 2025-04-15 04:02:25 from Xuneng Zhou

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Dave Cramer	2025-04-10 15:49:28	Re: Correct documentation for protocol version
Previous Message	Adrian Klaver	2025-04-10 15:22:56	Re: Capturing both IP address and hostname in the log