From: | John Naylor <jcnaylor(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) |
Date: | 2018-12-23 17:51:57 |
Message-ID: | CAJVSVGVM5tt9n9is=zg4dgNAPUWg-MZ3eta-BOhA57_yxZc1kQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/22/18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> John Naylor <jcnaylor(at)gmail(dot)com> writes:
>> Using a single file also gave me another idea: Take value and category
>> out of ScanKeyword, and replace them with an index into another array
>> containing those, which will only be accessed in the event of a hit.
>> That would shrink ScanKeyword to 4 bytes (offset, index), further
>> increasing locality of reference. Might not be worth it, but I can try
>> it after moving on to the core scanner.
>
> I like that idea a *lot*, actually, because it offers the opportunity
> to decouple this mechanism from all assumptions about what the
> auxiliary data for a keyword is.
Okay, in that case I went ahead and did it for WIP v3.
> (it'd be a good idea to have a switch that allows specifying the
> prefix of these constant names).
Done as an optional switch, and tested, but not yet used in favor of
the previous method as a fallback. I'll probably do it in the final
version to keep lines below 80, and to add 'core_' to the core keyword
vars.
> /* Payload data for keywords */
> typedef struct MyKeyword
> {
> int16 value;
> int16 category;
> } MyKeyword;
I tweaked this a bit to
typedef struct ScanKeywordAux
{
int16 value; /* grammar's token code */
char category; /* see codes above */
} ScanKeywordAux;
It seems that category was only 2 bytes to make ScanKeyword a power of
2 (of course that was on 32 bit machines and doesn't hold true
anymore). Using char will save another few hundred bytes in the core
scanner. Since we're only accessing this once per identifier, we may
not need to worry so much about memory alignment.
> Aside from being arguably better from the locality-of-reference
> standpoint, this gets us out of the weird ifdef'ing you've got in
> the v2 patch. The kwlist_d.h headers can be very ordinary headers.
Yeah, that's a nice (and for me unexpected) bonus.
-John Naylor
Attachment | Content-Type | Size |
---|---|---|
v3-wip-use-offset-based-scankeywords.patch | text/x-patch | 22.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Alexander Korotkov | 2018-12-23 20:14:50 | Re: Function to track shmem reinit time |
Previous Message | Darafei Komяpa Praliaskouski | 2018-12-23 16:23:33 | Re: Joins on TID |