Re: check for null value before looking up the hash function

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: check for null value before looking up the hash function
Date: 2022-05-21 16:04:45
Message-ID: CAKFQuwbFBCk+_LF59TcF16KOz9u5qAbx2UtAuju2aFfFhXbMHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, May 21, 2022 at 8:32 AM Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> wrote:

> Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
> tomas(dot)vondra(at)enterprisedb(dot)com> escreveu:
>
>>
>>
>> On 5/21/22 15:06, Ranier Vilela wrote:
>> >>Zhihong Yu <zyu(at)yugabyte(dot)com> writes:
>> >>> I was looking at the code in hash_record()
>> >>> of src/backend/utils/adt/rowtypes.c
>> >>> It seems if nulls[i] is true, we don't need to look up the hash
>> function.
>> >
>> >>I don't think this is worth changing. It complicates the logic,
>> >>rendering it unlike quite a few other functions written in the same
>> >>style. In cases where the performance actually matters, the hash
>> >>function is cached across multiple calls anyway. You might save
>> >>something if you have many calls in a query and not one of them
>> >>receives a non-null input, but how likely is that?
>> >
>> > I disagree.
>> > I think that is worth changing. The fact of complicating the logic
>> > is irrelevant.
>>
>> That's a quite bold claim, and yet you haven't supported it by any
>> argument whatsoever. Trade-offs between complexity and efficiency are a
>> crucial development task, so complicating the logic clearly does matter.
>>
> What I meant is that complicating the logic in search of efficiency is
> worth it, and that's what everyone is looking for in this thread.
> Likewise, not complicating the logic, losing a little bit of efficiency,
> applied to all the code, leads to a big loss of efficiency.
> In other words, I never miss an opportunity to gain efficiency.
>
>
I disliked the fact that 80% of the patch was adding indentation. Instead,
remove indentation for the normal flow case and move the loop short-circuit
to the top of the loop where it is traditionally found.

This seems like a win on both efficiency and complexity grounds. Having
the "/* see hash_array() */" comment and logic twice is a downside but a
minor one that could be replaced with a function call if desired.

diff --git a/src/backend/utils/adt/rowtypes.c
b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..0bc28d1742 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -1838,6 +1838,13 @@ hash_record(PG_FUNCTION_ARGS)
TypeCacheEntry *typentry;
uint32 element_hash;

+ if (nulls[i])
+ {
+ /* see hash_array() */
+ result = (result << 5) - result + 0;
+ continue;
+ }
+
att = TupleDescAttr(tupdesc, i);

if (att->attisdropped)
@@ -1860,24 +1867,16 @@ hash_record(PG_FUNCTION_ARGS)
my_extra->columns[i].typentry = typentry;
}

- /* Compute hash of element */
- if (nulls[i])
- {
- element_hash = 0;
- }
- else
- {
- LOCAL_FCINFO(locfcinfo, 1);
+ LOCAL_FCINFO(locfcinfo, 1);

- InitFunctionCallInfoData(*locfcinfo,
&typentry->hash_proc_finfo, 1,
- att->attcollation, NULL, NULL);
- locfcinfo->args[0].value = values[i];
- locfcinfo->args[0].isnull = false;
- element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));
+ InitFunctionCallInfoData(*locfcinfo, &typentry->hash_proc_finfo, 1,
+ att->attcollation, NULL, NULL);
+ locfcinfo->args[0].value = values[i];
+ locfcinfo->args[0].isnull = false;
+ element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));

- /* We don't expect hash support functions to return null */
- Assert(!locfcinfo->isnull);
- }
+ /* We don't expect hash support functions to return null */
+ Assert(!locfcinfo->isnull);

/* see hash_array() */
result = (result << 5) - result + element_hash;

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-05-21 16:13:24 Re: check for null value before looking up the hash function
Previous Message Ranier Vilela 2022-05-21 15:32:20 Re: check for null value before looking up the hash function