From: | John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> |
---|---|
To: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: speed up unicode normalization quick check |
Date: | 2020-05-30 06:52:24 |
Message-ID: | CACPNZCsWu2XSLuqFfdye8GrL+89QjfL7+i1Bp02H-2Z5k49v6g@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, May 30, 2020 at 12:13 AM Mark Dilger
<mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
>
> I forgot in my first round of code review to mention, "thanks for the patch". I generally like what you are doing here, and am trying to review it so it gets committed.
And I forgot to say thanks for taking a look!
> The reason I gave this feedback is that I saved the *kwlist_d.h files generated before applying the patch, and compared them with the same files generated after applying the patch, and noticed a very slight degradation. Most of the files changed without any expansion, but the largest of them, src/common/kwlist_d.h, changed from
>
> static const int16 h[901]
>
> to
>
> static const int16 h[902]
Interesting, I hadn't noticed. With 450 keywords, we need at least 901
elements in the table. Since 901 (= 17 * 53) is divisible by the new hash
multiplier 17, this loop in the generator gets triggered:
    # However, it would be very bad if $nverts were exactly equal to either
    # $hash_mult1 or $hash_mult2: effectively, that hash function would be
    # sensitive to only the last byte of each key.  Cases where $nverts is a
    # multiple of either multiplier likewise lose information.  (But $nverts
    # can't actually divide them, if they've been intelligently chosen as
    # primes.)  We can avoid such problems by adjusting the table size.
    while ($nverts % $hash_mult1 == 0
        || $nverts % $hash_mult2 == 0)
    {
        $nverts++;
    }
This is harmless, and will go away next time we add a keyword.
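
To make the arithmetic concrete, here is a rough Python sketch of the same adjustment. It is only an illustration, not the generator's actual code: the function names are made up, the minimum size is assumed to be 2n + 1 (which matches 450 keywords giving 901), and only the multiplier 17 mentioned above is used.

```python
def min_table_size(num_keys):
    # Assumption: the table needs at least 2n + 1 elements,
    # consistent with 450 keywords -> 901 in this message.
    return 2 * num_keys + 1


def adjust_table_size(nverts, multipliers):
    # Same idea as the quoted Perl loop: grow the table until its size
    # is not a multiple of any hash multiplier.
    while any(nverts % m == 0 for m in multipliers):
        nverts += 1
    return nverts


nverts = min_table_size(450)
print(nverts, nverts % 17)               # 901 0  (901 = 17 * 53)
print(adjust_table_size(nverts, (17,)))  # 902

# One more keyword: 903 is not a multiple of 17, so no padding is added.
print(adjust_table_size(min_table_size(451), (17,)))  # 903
```

Under those assumptions, the extra element appears only at exactly 450 keywords, and the table goes back to its minimum size at 451.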
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services