From: | John Naylor <jcnaylor(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) |
Date: | 2018-12-27 16:19:51 |
Message-ID: | CAJVSVGUUOTd2_Rv46FTcy9Yp6+xMwo0BwdSvPWCaLia_tesQjA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/26/18, John Naylor <jcnaylor(at)gmail(dot)com> wrote:
> On 12/26/18, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I wonder if we could do something really simple like a lookup based on
>> the first character of the scan keyword. It looks to me like there are
>> 440 keywords right now, and the most common starting letter is 'c',
>> which is the first letter of 51 keywords. So dispatching based on the
>> first letter clips at least 3 steps off the binary search. I don't
>> know whether that's enough to be worthwhile, but it's probably pretty
>> simple to implement.
> I agree it'd be fairly simple to do, and might raise the bar for doing
> anything more complex.
I went ahead and did this for v4, but split out into a separate patch.
In addition, I used a heuristic to bypass binary search for the most
common keywords. Normally, the middle value is computed
mathematically, but I found that in each range of keywords beginning
with the same letter, there is often 1 or 2 common keywords that are
good first guesses, such as select, from, join, limit, where. I taught
the lookup to try those first, and then compute subsequent steps the
usual way.
Barring additional bikeshedding on 0001, I'll plan on implementing
offset-based lookup for the other keyword types and retire the old
ScanKeyword. Once that's done, we can benchmark and compare with the
optimizations in 0002.
-John Naylor
Attachment | Content-Type | Size |
---|---|---|
v4-0001-WIP-Use-offset-based-keyword-lookup.patch | text/x-patch | 22.8 KB |
v4-0002-Dispatch-keyword-lookup-on-the-first-character.patch | text/x-patch | 6.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hugh Ranalli | 2018-12-27 16:21:52 | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Previous Message | Magnus Hagander | 2018-12-27 14:55:58 | Re: Offline enabling/disabling of data checksums |