From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-04-10 15:43:20 |
Message-ID: | CABPTF7WGU1dNjOn_Ea2Os78zqu90jEUr=oyveOtpG8bNVh87bA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I’ve updated and rebased Maxim's patch version 4 with optimizations like
ring buffer and capped max_requests proposed by Heikki. These are the
summary of Changes in v5:
• Replaced the linear array model in AbsorbSyncRequests() with a bounded
ring buffer to avoid large memmove() operations when processing sync
requests.
• Introduced a tunable batch size (CKPT_REQ_BATCH_SIZE, default:
10,000) to incrementally
absorb requests while respecting MaxAllocSize.
• Capped max_requests with a hard upper bound (MAX_CHECKPOINT_REQUESTS =
10,000,000) to avoid pathological memory growth when shared_buffers is very
large.
• Updated CompactCheckpointerRequestQueue() to support the ring buffer
layout.
• Retained the existing compacting logic but modified it to operate
in-place over the ring.
> >> The patch itself looks ok to me. I'm curious about the trade-offs
> >> between this incremental approach and the alternative of
> >> using palloc_extended() with the MCXT_ALLOC_HUGE flag. The approach
> >> of splitting the requests into fixed-size slices avoids OOM
> >> failures or process termination by the OOM killer, which is good.
>
> +1
>
> >> However, it does add some overhead with additional lock acquisition/
> >> release cycles and memory movement operations via memmove(). The
> >> natural question is whether the security justify the cost.
>
> Hmm, if you turned the array into a ring buffer, you could absorb part
> of the queue without the memmove().
>
> With that, I'd actually suggest using a much smaller allocation, maybe
> 10000 entries or even less. That should be enough to keep the lock
> acquire/release overhead acceptable
>
> >> Regarding the slice size of 1 GB, is this derived from
> >> MaxAllocSize limit, or was it chosen for other performance
> >> reasons? whether a different size might offer better performance
> >> under typical workloads?
> >
> > I think 1 GB is derived purely from MaxAllocSize. This "palloc" is a
> > relatively old one, and no one expected the number of requests to exceed
> > 1 GB. Now we have the ability to set the shared_buffers to a huge number
> > (without discussing now whether this makes any real sense), thus this
> > limit for palloc becomes a problem.
>
> Yeah. Frankly even 1 GB seems excessive for this. We set max_requests =
> NBuffers, which makes some sense so that you can fit the worst case
> scenario of quickly evicting all pages from the buffer cache. But even
> then I'd expect the checkpointer to be able to absorb the changes before
> the queue fills up. And we have the compaction logic now too.
>
> So I wonder if we should cap max_requests at, say, 10 million requests now?
>
> CompactCheckpointerRequestQueue() makes a pretty large allocation too,
> BTW, although much smaller than the one in AbsorbSyncRequests(). The
> hash table it builds could grow quite large though.
>
> I’m not certain what the optimal cap for max_requests should be—whether
it’s 10 million(setting it to 10 million would result in a peak temporary
allocation of roughly 700 MB within CompactCheckpointerRequestQueue), 1
million, or some other value—but introducing an upper bound seems
important. Without a cap, max_requests could theoretically scale with
NBuffers, potentially resulting in excessive temporary memory allocations.
As this is my first patch submission, it might be somewhat naive and
experimental— I appreciate your patience and feedback.
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Process-sync-requests-incrementally-in-AbsorbSyncReq.patch | application/octet-stream | 9.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Dave Cramer | 2025-04-10 15:49:28 | Re: Correct documentation for protocol version |
Previous Message | Adrian Klaver | 2025-04-10 15:22:56 | Re: Capturing both IP address and hostname in the log |