Quick Links

Re: Incorrect CHUNKHDRSZ in nodeAgg.c

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Incorrect CHUNKHDRSZ in nodeAgg.c
Date:	2025-01-10 10:30:23
Message-ID:	CAApHDvqv1aNB4cM36FzRwivXrEvBO_LsG_eQ3nqDXTjECaatOQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, 9 Jan 2025 at 09:50, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> Attached POC patch, which reduces memory usage by ~15% for a simple
> distinct query on an integer key. Performance is the same or perhaps a
> hair faster.
>
> It's not many lines of code, but the surrounding code might benefit
> from some refactoring which would make it a bit simpler.

Thanks for working on this. Here's a preliminary review:

Since bump.c does not add headers to the palloc'd chunks, I think the
following code from hash_agg_entry_size() shouldn't be using
CHUNKHDRSZ anymore.

tupleChunkSize = CHUNKHDRSZ + tupleSize;

if (pergroupSize > 0)
pergroupChunkSize = CHUNKHDRSZ + pergroupSize;
else
pergroupChunkSize = 0;

You should be able to get rid of pergroupChunkSize and just use
pergroupSize in the return.

I did some benchmarking using the attached script. There's a general
speedup, but I saw some unexpected increase in the number of batches
with the patched version on certain tests. See the attached results.
For example, the work_mem = 8MB with 10 million rows shows "Batches:
129" on master but "Batches: 641" with the patched version. I didn't
check why.

David

Attachment	Content-Type	Size
hashagg_bench.sh.txt	text/plain	1.3 KB
Bump_Alloc_for_Hashagg_bench_results.txt	text/plain	7.8 KB

In response to

Re: Incorrect CHUNKHDRSZ in nodeAgg.c at 2025-01-08 20:50:10 from Jeff Davis

Responses

Re: Incorrect CHUNKHDRSZ in nodeAgg.c at 2025-01-13 20:09:54 from Jeff Davis

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Ants Aasma	2025-01-10 10:33:39	Re: AIO v2.0
Previous Message	Ajin Cherian	2025-01-10 10:08:17	Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.