Re: Reduce TupleHashEntryData struct size by half

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reduce TupleHashEntryData struct size by half
Date: 2025-02-28 04:09:13
Message-ID: CAApHDvpN4v3t_sdz4dvrv1Fx_ZPw=twSnxuTEytRYP7LFz5K9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 13 Feb 2025 at 14:01, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> Attached a new patchset that doesn't change the API for
> ExecCopySlotMinimalTuple(). Instead, I'm using
> ExecFetchSlotMinimalTuple(), which avoids the extra memcpy if the slot
> is TTSOpsMinimalTuple, which is what HashAgg uses. I separated out the
> change to use ExecFetchSlotMinimalTuple() into 0004 to illustrate the
> performance impact.

Some review comments:

* my_log2() takes a "long" parameter type but transitionSpace is a
"Size". These types aren't the same width on all platforms we support.
Maybe pg_nextpower2_size_t() is a better option.

* Should the following have MAXALIGN(tupleSize) on the right side of
the expression?

tupleChunkSize = tupleSize

I understand this was missing before, but both bump.c does consume at
least MAXALIGN(<req_size>) in BumpAlloc().

* while (16 * maxBlockSize > work_mem * 1024L)

The 1024L predates the change made in 041e8b95. "1024L" needs to be
replaced with "(Size) 1024"

Maybe you could just replace the while loop and the subsequent "if" check with:

/*
* Like CreateWorkExprContext(), use smaller sizings for smaller work_mem,
* to avoid large jumps in memory usage.
*/
maxBlockSize = pg_prevpower2_size_t(work_mem * (Size) 1024 / 16);

/* But no bigger than ALLOCSET_DEFAULT_MAXSIZE */
maxBlockSize = Min(maxBlockSize, ALLOCSET_DEFAULT_MAXSIZE);

/* and no smaller than ALLOCSET_DEFAULT_INITSIZE */
maxBlockSize = Max(maxBlockSize, ALLOCSET_DEFAULT_INITSIZE);

I believe that gives the same result without the looping.

* In hash_create_memory(), can you get rid of the minContextSize and
initBlockSize variables?

* Is it worth an Assert() theck additionalsize > 0?

* The amount of space available is the additionalsize requested in the call
* to BuildTupleHashTable(). If additionalsize was specified as zero, no
* additional space is available and this function should not be called.
*/
static inline void *
TupleHashEntryGetAdditional(TupleHashTable hashtable, TupleHashEntry entry)
{
return (char *) entry->firstTuple - hashtable->additionalsize;
}

Benchmarks:

I was also experimenting with the performance of this using the
following test case:

create table c (a int not null);
insert into c select a from generate_Series(1,1000) a;
vacuum freeze analyze c;

Query: select a,count(*) from c group by a;

Average TPS over 10x 10 second runs with -M prepared

master: 3653.9853988
v7-0001: 3741.815979
v7-0001+0002: 3737.4313064
v7-0001+0002+0003: 3834.6271706
v7-0001+0002+0004+0004: 3912.1158887

I also tried out changing hash_agg_check_limits() so that it no longer
calls MemoryContextMemAllocated and instead uses ->mem_allocated
directly and with all the other patches got:

v7-0001+0002+0004+0004+extra: 4049.0732381

We probably shouldn't do exactly that as it be better not to access
that internal field from outside the memory context code. A static
inline function in memutils.h that handles the non-recursive callers
might be nice.

I've attached my results of running your test in graph form. I also
see a small regression for these small scale tests. I wondering if
it's worth mocking up some code to see what the performance would be
like without the additional memcpy() that's new to
LookupTupleHashEntry_internal(). How about hacking something up that
adds an additionalsize field to TupleDesc and then set that field to
your additional size and have heap_form_minimal_tuple() allocate that
much extra memory?

David

Attachment Content-Type Size
image/png 60.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-02-28 04:15:17 Re: long-standing data loss bug in initial sync of logical replication
Previous Message Noah Misch 2025-02-28 04:01:43 Re: Licence preamble update