From: | Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: scalability bottlenecks with (many) partitions (and more) |
Date: | 2024-01-29 15:42:27 |
Message-ID: | 1782373.VLH7GnMWUR@aivenlaptop |
Lists: | pgsql-hackers |
On Monday, 29 January 2024, 15:59:04 CET, Tomas Vondra wrote:
> I'm not sure work_mem is a good parameter to drive this. It doesn't say
> how much memory we expect the backend to use - it's a per-operation
> limit, so it doesn't work particularly well with partitioning (e.g. with
> 100 partitions, we may get 100 nodes, which is completely unrelated to
> what work_mem says). A backend running the join query with 1000
> partitions uses ~90MB (judging by data reported by the mempool), even
> with work_mem=4MB. So setting the trim limit to 4MB is pretty useless.
I understand your point; I was basing my previous observations on what a
backend typically does during execution.
>
> The mempool could tell us how much memory we need (but we could track
> this in some other way too, probably). And we could even adjust the mmap
> parameters regularly, based on current workload.
>
> But then there's the problem that the mmap parameters don't tell
> us how much memory to keep, but how large chunks to release.
>
> Let's say we want to keep the 90MB (to allocate the memory once and then
> reuse it). How would you do that? We could set MMAP_TRIM_TRESHOLD 100MB,
> but then it takes just a little bit of extra memory to release all the
> memory, or something.
To do this, you can set M_TOP_PAD in glibc malloc, which makes sure a
certain amount of memory is always kept.
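
As a minimal sketch of the M_TOP_PAD suggestion above (glibc only): the helper
name keep_heap_padding and the 90MB figure in the usage note are illustrative
assumptions, not something PostgreSQL does today. mallopt() takes an int value
and returns 1 on success and 0 on error.

```c
#include <malloc.h>   /* glibc-specific: mallopt(), M_TOP_PAD */
#include <stddef.h>

/*
 * Hypothetical helper: ask glibc malloc to preserve about keep_bytes of
 * free space at the top of the heap when it trims, instead of returning
 * it to the kernel. Returns 1 on success, 0 on error (mallopt's own
 * convention).
 *
 * Note: explicitly setting M_TOP_PAD also turns off glibc's dynamic
 * adjustment of the mmap threshold.
 */
int keep_heap_padding(size_t keep_bytes)
{
    /* mallopt() takes an int, so the value must fit in one. */
    return mallopt(M_TOP_PAD, (int) keep_bytes);
}
```

For the 90MB case discussed above, this would be called once early in the
process as keep_heap_padding(90 * 1024 * 1024). The same setting can be applied
without code changes through the MALLOC_TOP_PAD_ environment variable.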
But the way the dynamic adjustment works makes it sort-of work like this.
MMAP_THRESHOLD and TRIM_THRESHOLD start with low values, meaning we don't
expect to keep much memory around.
So even "small" memory allocations will be served using mmap at first. Once
mmaped memory is released, glibc considers it a benchmark for "normal"
allocations that can be routinely freed, and adjusts mmap_threshold to the
released mmaped region size, and trim threshold to two times that.
It means that over time the two values will converge either to the max value
(32MB for MMAP_THRESHOLD, 64MB for trim threshold) or to something big enough
to accommodate your released memory, since anything bigger than half the trim
threshold will be allocated using mmap.
Setting any of these parameters explicitly disables that dynamic adjustment.
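
The adjustment described above can be sketched as a small model. This is a
simplified illustration, not glibc's actual code: the MallocModel type and the
function names are made up for the example, and the 128kB starting values and
32MB cap are assumed to match the 64-bit glibc defaults.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cap on the dynamic mmap threshold (64-bit glibc default). */
#define MODEL_MMAP_THRESHOLD_MAX ((size_t) 32 * 1024 * 1024)

/* Hypothetical model of the two thresholds glibc adjusts at runtime. */
typedef struct
{
    size_t mmap_threshold;  /* requests at or above this size use mmap */
    size_t trim_threshold;  /* free top-of-heap space above this is released */
    bool   dynamic;         /* false once any parameter is set explicitly */
} MallocModel;

/* Start from the assumed defaults: both thresholds low, adjustment on. */
MallocModel model_init(void)
{
    MallocModel m = { 128 * 1024, 128 * 1024, true };
    return m;
}

/*
 * Model of freeing an mmaped chunk: if it was larger than the current
 * mmap threshold (and not above the cap), treat its size as "normal",
 * raise mmap_threshold to it and set trim_threshold to twice that.
 */
void model_free_mmapped(MallocModel *m, size_t chunk_size)
{
    if (m->dynamic &&
        chunk_size > m->mmap_threshold &&
        chunk_size <= MODEL_MMAP_THRESHOLD_MAX)
    {
        m->mmap_threshold = chunk_size;
        m->trim_threshold = 2 * chunk_size;
    }
}
```

In this model, releasing a 1MB mmaped region moves mmap_threshold to 1MB and
trim_threshold to 2MB, and the values only ever grow, up to 32MB and 64MB.
Clearing the dynamic flag stands in for setting a parameter explicitly, after
which the thresholds stay fixed.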
But I'm not arguing against the mempool, just chiming in with glibc's malloc
tuning possibilities :-)