Re: Reduce TupleHashEntryData struct size by half

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reduce TupleHashEntryData struct size by half
Date: 2024-11-21 20:37:56
Message-ID: 7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


New patch series attached.

* a few cleanup patches, but 0001 and 0004 can affect initial hash
table sizing
* also pack AggStatePerGroupData (to 10 bytes)
* put additional data in the same chunk as firstTuple to avoid an
extra pointer an an extra allocation

On Tue, 2024-11-19 at 01:30 +0200, Heikki Linnakangas wrote:
> Sounds like a good idea. Needs some changes to the TupleTableSlotOps
> interface to avoid the memcpy I presume.

Neither firstTuple nor the additional data have a size known at compile
time, so it can't be represented by a struct. It seems best to keep
firstTuple at the beginning so that it matches the SH_KEY_TYPE, and
then just repalloc() to make room at the end for the additional data,
which avoids the memcpy unless it crosses a power-of-two boundary.

> >
> Queries that have a only a small number of groups might might benefit
> from storing a plain Datum/isnull array instead of a MinimalTuple.
> That
> would take more memory when you have a lot of groups though.

The current MinimalTuple representation is pretty wasteful for small
grouping keys, so I don't think it would be hard to beat.

> > Alternatively, MinimalTuple is not very "minimal", and perhaps we
> > can
> > just make it better.
>
> Yeah. It tries to be compatible with HeapTuple, but perhaps we should
> give up on that and pack it more tightly instead.

From the perspective of HashAgg, that seems worthwhile.

In my current patch set, depending on the grouping key and aggregates,
the chunk size for the combination of the firstTuple with the
additional data can be just above 32 bytes, which pushes the chunk size
to 64 bytes. If we cut down the mintuple overhead a bit, more cases
would fit in 32 bytes. And I think we can: 32 bytes seems reasonable to
hold a lot of common cases where there's a small grouping key and a
small pergroup state.

Regards,
Jeff Davis

Attachment Content-Type Size
v2-0001-ExecInitAgg-update-aggstate-numaggs-and-numtrans-.patch text/x-patch 1.5 KB
v2-0002-Add-missing-typedefs.list-entry-for-AggStatePerGr.patch text/x-patch 1.2 KB
v2-0003-Remove-unused-TupleHashTableData-entrysize.patch text/x-patch 1.6 KB
v2-0004-nodeSetOp.c-missing-additionalsize-for-BuildTuple.patch text/x-patch 1.0 KB
v2-0005-TupleHashTable-store-additional-data-along-with-t.patch text/x-patch 12.0 KB
v2-0006-simplehash-don-t-require-a-status-field.patch text/x-patch 7.1 KB
v2-0007-TupleHashTable-reduce-overhead-by-removing-status.patch text/x-patch 4.9 KB
v2-0008-Pack-TupleHashEntryData-and-AggStatePerGroupData.patch text/x-patch 1.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-11-21 20:43:58 Re: Replace current implementations in crypt() and gen_salt() to OpenSSL
Previous Message Thomas Munro 2024-11-21 20:16:48 Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)