From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Maxim Orlov <orlovmg(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-04-15 04:02:25 |
Message-ID: | CABPTF7VbszDaqyqS8WeyxJgyyjHvb9x1L3Ls7gSwr3KuiBr0wg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I've moved this entry to the July CommitFest. The CI reported an unused
variable warning in patch v5, so I've addressed it by removing the unused
one. Sorry for not catching the issue locally.
Xuneng Zhou <xunengzhou(at)gmail(dot)com> 于2025年4月10日周四 23:43写道:
> Hi,
>
> I’ve updated and rebased Maxim's patch version 4 with optimizations like
> ring buffer and capped max_requests proposed by Heikki. These are the
> summary of Changes in v5:
>
> • Replaced the linear array model in AbsorbSyncRequests() with a bounded
> ring buffer to avoid large memmove() operations when processing sync
> requests.
>
> • Introduced a tunable batch size (CKPT_REQ_BATCH_SIZE, default: 10,000)
> to incrementally absorb requests while respecting MaxAllocSize.
>
> • Capped max_requests with a hard upper bound (MAX_CHECKPOINT_REQUESTS =
> 10,000,000) to avoid pathological memory growth when shared_buffers is
> very large.
>
> • Updated CompactCheckpointerRequestQueue() to support the ring buffer
> layout.
>
> • Retained the existing compacting logic but modified it to operate
> in-place over the ring.
>
>> >> The patch itself looks ok to me. I'm curious about the trade-offs
>> >> between this incremental approach and the alternative of
>> >> using palloc_extended() with the MCXT_ALLOC_HUGE flag. The approach
>> >> of splitting the requests into fixed-size slices avoids OOM
>> >> failures or process termination by the OOM killer, which is good.
>>
>> +1
>>
>> >> However, it does add some overhead with additional lock acquisition/
>> >> release cycles and memory movement operations via memmove(). The
>> >> natural question is whether the security justify the cost.
>>
>> Hmm, if you turned the array into a ring buffer, you could absorb part
>> of the queue without the memmove().
>>
>> With that, I'd actually suggest using a much smaller allocation, maybe
>> 10000 entries or even less. That should be enough to keep the lock
>> acquire/release overhead acceptable
>>
>
>
>> >> Regarding the slice size of 1 GB, is this derived from
>> >> MaxAllocSize limit, or was it chosen for other performance
>> >> reasons? whether a different size might offer better performance
>> >> under typical workloads?
>> >
>> > I think 1 GB is derived purely from MaxAllocSize. This "palloc" is a
>> > relatively old one, and no one expected the number of requests to
>> exceed
>> > 1 GB. Now we have the ability to set the shared_buffers to a huge
>> number
>> > (without discussing now whether this makes any real sense), thus this
>> > limit for palloc becomes a problem.
>>
>> Yeah. Frankly even 1 GB seems excessive for this. We set max_requests =
>> NBuffers, which makes some sense so that you can fit the worst case
>> scenario of quickly evicting all pages from the buffer cache. But even
>> then I'd expect the checkpointer to be able to absorb the changes before
>> the queue fills up. And we have the compaction logic now too.
>>
>> So I wonder if we should cap max_requests at, say, 10 million requests
>> now?
>>
>> CompactCheckpointerRequestQueue() makes a pretty large allocation too,
>> BTW, although much smaller than the one in AbsorbSyncRequests(). The
>> hash table it builds could grow quite large though.
>>
>> I’m not certain what the optimal cap for max_requests should be—whether
> it’s 10 million(setting it to 10 million would result in a peak temporary
> allocation of roughly 700 MB within CompactCheckpointerRequestQueue), 1
> million, or some other value—but introducing an upper bound seems
> important. Without a cap, max_requests could theoretically scale with
> NBuffers, potentially resulting in excessive temporary memory allocations.
>
> As this is my first patch submission, it might be somewhat naive and
> experimental— I appreciate your patience and feedback.
>
Attachment | Content-Type | Size |
---|---|---|
v6-0001-Process-sync-requests-incrementally-in-AbsorbSyncReq.patch | application/octet-stream | 9.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amul Sul | 2025-04-15 04:22:19 | pg_combinebackup: correct code comment. |
Previous Message | Tender Wang | 2025-04-15 03:20:53 | Re: not null constraints, again |