Re: Use generation context to speed up tuplesorts

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Use generation context to speed up tuplesorts
Date: 2021-12-08 21:07:12
Message-ID: 567d1ea7-bfeb-bd97-1a7f-b13d2770258c@enterprisedb.com
Lists: pgsql-hackers

On 12/8/21 16:51, Ronan Dunklau wrote:
> On Thursday, 9 September 2021, 15:37:59 CET, Tomas Vondra wrote:
>> And now comes the funny part - if I run it in the same backend as the
>> "full" benchmark, I get roughly the same results:
>>
>>  block_size | chunk_size | mem_allocated | alloc_ms | free_ms
>> ------------+------------+---------------+----------+---------
>>       32768 |        512 |     806256640 |    37159 |   76669
>>
>> but if I reconnect and run it in the new backend, I get this:
>>
>>  block_size | chunk_size | mem_allocated | alloc_ms | free_ms
>> ------------+------------+---------------+----------+---------
>>       32768 |        512 |     806158336 |   233909 |  100785
>> (1 row)
>>
>> It does not matter if I wait a bit before running the query, if I run it
>> repeatedly, etc. The machine is not doing anything else, the CPU is set
>> to use "performance" governor, etc.
>
> I've reproduced the behaviour you mention.
> I also noticed asm_exc_page_fault showing up in the perf report in that case.
>
> Running an strace on it shows that in one case, we have a lot of brk calls,
> while when we run in the same process as the previous tests, we don't.
>
> My suspicion is that the previous workload makes glibc malloc change its
> trim_threshold and possibly other dynamic options, which leads to constantly
> moving the brk pointer in one case and not the other.
>
> Running your fifo test with absurd malloc options shows that this might indeed
> be the case (I needed to change several, because changing one disables the
> dynamic adjustment for every single one of them, and malloc would fall back to
> using mmap and freeing it on each iteration):
>
> mallopt(M_TOP_PAD, 1024 * 1024 * 1024);
> mallopt(M_TRIM_THRESHOLD, 256 * 1024 * 1024);
> mallopt(M_MMAP_THRESHOLD, 4*1024*1024*sizeof(long));
>
> I get the following results for your self-contained test. I ran the query
> twice in each case, to show the cost of the first allocation compared with
> the subsequent ones:
>
> With default malloc options:
>
>  block_size | chunk_size | mem_allocated | alloc_ms | free_ms
> ------------+------------+---------------+----------+---------
>       32768 |        512 |     795836416 |   300156 |  207557
>
>  block_size | chunk_size | mem_allocated | alloc_ms | free_ms
> ------------+------------+---------------+----------+---------
>       32768 |        512 |     795836416 |   211942 |   77207
>
>
> With the oversized values above:
>
>  block_size | chunk_size | mem_allocated | alloc_ms | free_ms
> ------------+------------+---------------+----------+---------
>       32768 |        512 |     795836416 |   219000 |   36223
>
>
>  block_size | chunk_size | mem_allocated | alloc_ms | free_ms
> ------------+------------+---------------+----------+---------
>       32768 |        512 |     795836416 |    75761 |   78082
> (1 row)
>
> I can't tell how representative your benchmark extension would be of real life
> allocation / free patterns, but there is probably something we can improve
> here.
>

Thanks for looking at this. I think those allocation / free patterns are
fairly extreme, and there probably are no workloads doing exactly this.
The idea is that actual workloads are likely some combination of these
extreme cases.
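
As a rough illustration of one such extreme case, here is a minimal sketch
of a FIFO pattern using plain malloc/free: allocate many fixed-size chunks,
then free them oldest-first, timing each phase. This is not the benchmark
extension discussed in this thread (which goes through PostgreSQL memory
contexts); the names FifoTiming and run_fifo_pattern are hypothetical, and
the timings will depend heavily on the malloc implementation and settings.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */

#include <stdlib.h>
#include <time.h>

/* Hypothetical result type: time spent in each phase, in milliseconds. */
typedef struct FifoTiming
{
    double alloc_ms;
    double free_ms;
    size_t nchunks;     /* number of chunks actually allocated */
} FifoTiming;

static double
elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (end->tv_sec - start->tv_sec) * 1000.0 +
           (end->tv_nsec - start->tv_nsec) / 1e6;
}

/*
 * Allocate up to nchunks chunks of chunk_size bytes, then free them in
 * the same order they were allocated (first in, first out).
 */
FifoTiming
run_fifo_pattern(size_t nchunks, size_t chunk_size)
{
    FifoTiming result = {0.0, 0.0, 0};
    struct timespec t0, t1, t2;
    void **chunks = malloc(nchunks * sizeof(void *));

    if (chunks == NULL)
        return result;

    /* Allocation phase: stop early if malloc fails. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < nchunks; i++)
    {
        chunks[i] = malloc(chunk_size);
        if (chunks[i] == NULL)
            break;
        result.nchunks++;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Free phase: oldest chunk first. */
    for (size_t i = 0; i < result.nchunks; i++)
        free(chunks[i]);
    clock_gettime(CLOCK_MONOTONIC, &t2);

    result.alloc_ms = elapsed_ms(&t0, &t1);
    result.free_ms = elapsed_ms(&t1, &t2);
    free(chunks);
    return result;
}
```

Calling run_fifo_pattern(1000000, 512) twice in the same process, and
comparing the first and second result, would be one way to look for the
kind of warm-up effect described above, though a sketch this simple may not
reproduce it.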

> I'll try to see if I can understand more precisely what is happening.
>

Thanks, that'd be helpful. Maybe we can learn something about tuning
malloc parameters to get significantly better performance.
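
For anyone wanting to experiment, here is a minimal sketch of applying the
three settings quoted above early in a process, assuming glibc (mallopt()
and these M_* parameters are glibc-specific). The helper name
apply_oversized_malloc_options is hypothetical, and the values are simply
the ones from the quoted mail, not recommendations.

```c
#include <malloc.h>   /* mallopt(), M_TOP_PAD, M_TRIM_THRESHOLD, M_MMAP_THRESHOLD */

/*
 * Hypothetical helper: apply the oversized malloc settings from the
 * quoted mail. Returns 1 if every mallopt() call succeeded, 0 otherwise.
 */
int
apply_oversized_malloc_options(void)
{
    int ok = 1;

    /* Padding added whenever the heap is grown via the program break. */
    ok &= mallopt(M_TOP_PAD, 1024 * 1024 * 1024);

    /* Free space at the top of the heap must exceed this before trimming. */
    ok &= mallopt(M_TRIM_THRESHOLD, 256 * 1024 * 1024);

    /* Requests at or above this size may be served by mmap() instead. */
    ok &= mallopt(M_MMAP_THRESHOLD, (int) (4 * 1024 * 1024 * sizeof(long)));

    return ok;
}
```

mallopt() returns 1 on success and 0 on error, so the helper reports
whether all three values were accepted. As the quoted mail notes, setting
any one of these explicitly turns off glibc's dynamic threshold adjustment,
which is why several of them are set together.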

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
