From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: 9.5: Better memory accounting, towards memory-bounded HashAgg |
Date: | 2014-11-21 19:45:31 |
Message-ID: | 546F965B.1090202@fuzzy.cz |
Lists: | pgsql-hackers |
On 21.11.2014 00:03, Andres Freund wrote:
> On 2014-11-17 21:03:07 +0100, Tomas Vondra wrote:
>> On 17.11.2014 19:46, Andres Freund wrote:
>>
>>> The MemoryContextData struct is embedded into AllocSetContext.
>>
>> Oh, right. That makes it slightly more complicated, though, because
>> AllocSetContext adds 6 x 8B fields plus an in-line array of
>> freelist pointers. Which is 11x8 bytes. So in total 56+56+88=200B,
>> without the additional field. There might be some difference
>> because of alignment, but I still don't see how that one
>> additional field might impact cachelines?
>
> It's actually 196 bytes:
Ummmm, I think the pahole output shows 192, not 196? Otherwise it
wouldn't be exactly 3 cachelines anyway.
But yeah - my math-foo was weak for a moment, because 6x8 != 56. Which
is the 8B difference :-/
> struct AllocSetContext {
> MemoryContextData header; /* 0 56 */
> AllocBlock blocks; /* 56 8 */
> /* --- cacheline 1 boundary (64 bytes) --- */
> AllocChunk freelist[11]; /* 64 88 */
> /* --- cacheline 2 boundary (128 bytes) was 24 bytes ago --- */
> Size initBlockSize; /* 152 8 */
> Size maxBlockSize; /* 160 8 */
> Size nextBlockSize; /* 168 8 */
> Size allocChunkLimit; /* 176 8 */
> AllocBlock keeper; /* 184 8 */
> /* --- cacheline 3 boundary (192 bytes) --- */
>
> /* size: 192, cachelines: 3, members: 8 */
> };
>
> And thus one additional field tips it over the edge.
>
> "pahole" is a very useful utility.
Indeed.
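The corrected sum (56 + 8 + 88 + 5 x 8 = 192) can also be double-checked
without pahole. A minimal sketch, assuming the 56B header and 8B
pointers/Size from the output above; MockHeader, MockAllocSet and
cachelines_needed() are made-up names for illustration, not PostgreSQL code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE 64
#define NUM_FREELISTS  11

/* Hypothetical stand-in for MemoryContextData; only its 56B size matters. */
typedef struct MockHeader
{
    uint64_t    opaque[7];
} MockHeader;

/*
 * Same field order as the pahole output. Every pointer and Size is replaced
 * by uint64_t (an assumption) so each field is 8B on any platform.
 */
typedef struct MockAllocSet
{
    MockHeader  header;                     /* offset   0, 56B */
    uint64_t    blocks;                     /* offset  56,  8B */
    uint64_t    freelist[NUM_FREELISTS];    /* offset  64, 88B */
    uint64_t    initBlockSize;              /* offset 152 */
    uint64_t    maxBlockSize;               /* offset 160 */
    uint64_t    nextBlockSize;              /* offset 168 */
    uint64_t    allocChunkLimit;            /* offset 176 */
    uint64_t    keeper;                     /* offset 184 */
} MockAllocSet;

/* Cachelines spanned by a struct of the given size, if it starts aligned. */
static inline size_t
cachelines_needed(size_t bytes)
{
    return (bytes + CACHELINE_SIZE - 1) / CACHELINE_SIZE;
}

/* Compile-time checks against the offsets pahole reported. */
_Static_assert(offsetof(MockAllocSet, freelist) == 64, "freelist offset");
_Static_assert(offsetof(MockAllocSet, initBlockSize) == 152, "initBlockSize offset");
_Static_assert(offsetof(MockAllocSet, keeper) == 184, "keeper offset");
_Static_assert(sizeof(MockAllocSet) == 192, "total size");
```

With this, cachelines_needed(sizeof(MockAllocSet)) is 3, while
cachelines_needed(sizeof(MockAllocSet) + 8) is 4, i.e. one more 8B field
spills into a fourth cacheline.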
>> But if we separated the freelist, that might actually make it
>> faster, at least for calls that don't touch the freelist at all,
>> no? Because most of the palloc() calls will be handled from the
>> current block.
>
> I seriously doubt it. The additional indirection + additional
> branches are likely to make it worse.
That's possible, although I tried it on "my version" of the accounting
patch, and it showed slight improvement (lower overhead) on Robert's
reindex benchmark.
The question is how that would work with a regular workload, because
moving the freelist out of the structure makes it smaller (2 cachelines
instead of 3), and it can only impact workloads working with the
freelists (i.e. either by calling free, or realloc, or whatever).
Although palloc() checks the freelist too ...
Also, those pieces may be allocated together (next to each other), which
might keep locality.
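To make the "allocated together" idea concrete, here is a rough sketch. It
is not the patch, just an illustration under assumptions: a 64-bit platform
with 8B pointers and size_t, a 56B header, and hypothetical names
(SplitHeader, SplitAllocSet, split_allocset_create). The freelist array is
reached through a pointer, but lives in the same malloc'd chunk, right
behind the struct:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_FREELISTS 11

/* Hypothetical 56B stand-in for MemoryContextData. */
typedef struct SplitHeader
{
    uint64_t    opaque[7];
} SplitHeader;

/* Context with the freelist moved out of line (112B with 8B pointers). */
typedef struct SplitAllocSet
{
    SplitHeader header;
    void       *blocks;
    void      **freelist;       /* NUM_FREELISTS heads, stored out of line */
    size_t      initBlockSize;
    size_t      maxBlockSize;
    size_t      nextBlockSize;
    size_t      allocChunkLimit;
    void       *keeper;
} SplitAllocSet;

/*
 * Allocate the struct and the freelist array in one chunk, so the array
 * sits immediately after the struct. Returns NULL on allocation failure;
 * the caller releases everything with a single free().
 */
static inline SplitAllocSet *
split_allocset_create(void)
{
    size_t          total = sizeof(SplitAllocSet) +
                            NUM_FREELISTS * sizeof(void *);
    SplitAllocSet  *set = malloc(total);
    int             i;

    if (set == NULL)
        return NULL;

    memset(set, 0, sizeof(SplitAllocSet));
    set->freelist = (void **) (set + 1);    /* first byte past the struct */
    for (i = 0; i < NUM_FREELISTS; i++)
        set->freelist[i] = NULL;

    return set;
}
```

Code that never touches the freelist then only reads the first two
cachelines; code that does pays one extra pointer dereference, which is
the indirection cost mentioned above.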
But I haven't tested any of this, and my knowledge of this low-level
stuff is poor, so I might be completely wrong.
regards
Tomas