Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, tomas(at)vondra(dot)me, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>
Subject: Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Date: 2025-03-31 02:56:56
Message-ID: CAApHDvr6HOdTo1-ie4Gwq9XS+z6G5Xih8Ravt+XppqK1zhg5AA@mail.gmail.com
Lists: pgsql-hackers

On Sat, 29 Mar 2025 at 05:46, Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> PFA patches. 0001 and 0002 are the same as the previous set. 0003
> changes the initial hash table size to the length of ec_derives.

I'm just not following the logic in making it the length of the
ec_derives List. If you have 32 buckets and try to insert 32 elements,
you're guaranteed to need a resize after inserting 28 elements. See
the grow_threshold logic in SH_UPDATE_PARAMETERS(). The point of making
it 64 was to ensure the table is never unnecessarily sparse and to
also ensure we make it at least big enough for the minimum number of
ec_derives that we're about to insert.

Looking more closely at the patch's ec_add_clause_to_derives_hash()
function, I see you're actually making two hash table entries for each
RestrictInfo, so without any em_is_const members, you'll insert 64
entries into the hash table with an ec_derives list of 32, in which
case 64 buckets isn't enough and the table will end up growing to 128
buckets.
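
To make the arithmetic concrete, here is a rough sketch, assuming
simplehash's default fill factor of 0.9 and a table that doubles once
the member count reaches the threshold. The names FILLFACTOR,
grow_threshold() and buckets_after_inserts() are made up for
illustration and are not part of simplehash.h; growth triggered by
long collision chains is not modelled.

```c
#include <stdint.h>

/* Assumption: mirrors simplehash.h's default SH_FILLFACTOR. */
#define FILLFACTOR 0.9

/*
 * Hypothetical helper: member count at which a table with "nbuckets"
 * buckets must grow before accepting another entry.
 */
static uint32_t
grow_threshold(uint32_t nbuckets)
{
    return (uint32_t) ((double) nbuckets * FILLFACTOR);
}

/*
 * Hypothetical helper: bucket count after inserting "nentries" distinct
 * entries into a table that starts with "nbuckets" buckets, doubling
 * whenever the threshold has been reached at insert time.
 */
static uint32_t
buckets_after_inserts(uint32_t nbuckets, uint32_t nentries)
{
    uint32_t    members = 0;

    while (members < nentries)
    {
        if (members >= grow_threshold(nbuckets))
            nbuckets *= 2;      /* resize before adding the entry */
        members++;
    }
    return nbuckets;
}
```

Under those assumptions grow_threshold(32) is 28 and
grow_threshold(64) is 57, so buckets_after_inserts(32, 32) returns 64
and buckets_after_inserts(64, 64) returns 128, while
buckets_after_inserts(64, 32) stays at 64.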

I think you'd be better off coming up with some logic like putting the
EM with the lowest pointer value first in the key and ensuring that all
lookups do that too by wrapping them in some helper function. That'll
halve the number of hash table entries at the cost of some very cheap
comparisons.
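
Something along these lines is what I have in mind. This is only a
sketch: DerivesKey and make_derives_key() are hypothetical names, the
patch's real key struct may carry more fields, and the EMs are shown as
plain void pointers to keep the example standalone.

```c
#include <stdint.h>

/* Hypothetical key type: a pair of EquivalenceMember pointers. */
typedef struct DerivesKey
{
    const void *em1;            /* always the lower address */
    const void *em2;            /* always the higher address */
} DerivesKey;

/*
 * Hypothetical helper: build a key with the two pointers in a canonical
 * order so that (a, b) and (b, a) map to the same hash table entry.
 * Both inserts and lookups would go through this function.
 */
static DerivesKey
make_derives_key(const void *a, const void *b)
{
    DerivesKey  key;

    /* Compare as integers; "<" on unrelated pointers is undefined in C. */
    if ((uintptr_t) a <= (uintptr_t) b)
    {
        key.em1 = a;
        key.em2 = b;
    }
    else
    {
        key.em1 = b;
        key.em2 = a;
    }
    return key;
}
```

With every insert and lookup funnelled through one such helper, each
RestrictInfo needs a single entry rather than one per ordering of its
two EMs.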

David
