Re: General purpose hashing func in pgbench

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: General purpose hashing func in pgbench
Date: 2018-01-18 09:06:40
Message-ID: alpine.DEB.2.20.1801180710400.11884@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Ildar,

Patch v8 applies cleanly and compiles. Global and local "make check ok".
Doc build ok.

>> For me "random seed" is what is passed to srandom, so the variable
>> should rather be named hash_seed because there could also be a random
>> seed (actually, there is in another parallel patch:-). Moreover, this
>> seed may or may not be random if set, so calling it "random_seed" is
>> not desirable.
>
> My intention was to introduce seed variable which potentially could be
> used in different contexts, not only for hash functions.

Yes, good point. I'd need it for the pseudo-random permutation function.

> I renamed it to 'hash_seed' for now. But what do you think about naming
> it simply 'seed' or choosing some other general name?

ISTM "seed" that is prone to being used for something else in a script.
What about ":default_seed", which says it all?

Some minor suggestions:

In "make_func", I'd assert that nargs is positive in the default branch.

The default seed may be created with just one assignment instead of
repeated |= ops. Or not:-)

In evalStandardFunc, I'd suggest to avoid the ? construction and use an
if and a direct setIntValue in both case, removing the "result"
variable, as done in other branches (eg bitwise operators...). Maybe
something like:

if (func == murmur2)
setIntValue(retval, getHashXXX(val, seed));
else if (...)
...
else
Assert(0);

so that it is structurally ready for other hash functions if needed.

Documentation:

In the table, use official names in the description, and add wikipedia
links, something like:

FNV hash ->
<ulink url="https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function">FNV-1a hash</ulink>

murmur2 hash ->
<ulink url="https://en.wikipedia.org/wiki/MurmurHash">MurmurHash2 hash</ulink>

In the text:

"Hash functions accepts" -> "Hash functions
<literal>hash</literal>, <......> and <....> accept*NO S*"

"... provided hash_seed value is used" => "... provided the value of
<literal>:hash_seed</literal> is used, which is initialized randomly
unless set by the command-line <literal>-D</literal> option."

ISTM that the Automatic Variable table should be in alphabetical order.

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-01-18 09:28:14 Re: [Sender Address Forgery]Re: pg_(total_)relation_size and partitioned tables
Previous Message Konstantin Knizhnik 2018-01-18 08:59:47 Re: [HACKERS] Surjective functional indexes