Re: Incorrect CHUNKHDRSZ in nodeAgg.c

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Incorrect CHUNKHDRSZ in nodeAgg.c
Date: 2025-01-13 20:09:54
Message-ID: 9360e33d5595c1fa97a3c081833edd4e8cbe8f60.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2025-01-10 at 23:30 +1300, David Rowley wrote:
> Since bump.c does not add headers to the palloc'd chunks, I think the
> following code from hash_agg_entry_size() shouldn't be using
> CHUNKHDRSZ anymore.

Fixed.

I also tried to account for the power-of-two allocations for the
transition values. We don't do that in other places, but now that we
have the bump allocator which does not do that, it seems reasonable to
account for it here.

> I did some benchmarking using the attached script. There's a general
> speedup, but I saw some unexpected increase in the number of batches
> with the patched version on certain tests. See the attached results.
> For example, the work_mem = 8MB with 10 million rows shows "Batches:
> 129" on master but "Batches: 641" with the patched version. I didn't
> check why.

Somewhat counter-intuitively, HashAgg can use more batches when there
is more memory available. If already spilling, creating more small
batches is good, because it reduces the chances of recursing. The
limiting factor for creating a lot of tiny batches is that each
partition requires a logtape with its own write buffer, so if there's
more memory available, that allows creating more logtapes and a higher
partitioning fanout.

The runtimes I got running your tests are mixed. I'm still analyzing
whether test noise is a factor, or whether the increased number of
partitions is a factor. But any runtime regressions are minor in
comparison to the memory savings, so I think we are on the right track.

Attached v2.

Regards,
Jeff Davis

Attachment Content-Type Size
v2-0001-HashAgg-use-Bump-allocator-for-hash-TupleHashTabl.patch text/x-patch 11.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilia Evdokimov 2025-01-13 20:18:18 Re: explain analyze rows=%.0f
Previous Message Alexander Lakhin 2025-01-13 20:00:00 Re: InitControlFile misbehaving on graviton