From: | John Naylor <jcnaylor(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) |
Date: | 2018-12-21 23:38:48 |
Message-ID: | CAJVSVGXsogxLCphCW1=PQc0m5PFjnpx9Wc98UfmSXZM0inpv2g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/20/18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'd be inclined to put the script in src/tools, I think. IMO src/common
> is for code that actually gets built into our executables.
Done.
>> which takes
>> pl_unreserved_kwlist.h as input and outputs
>> pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h.
>
> I wonder whether we'd not be better off producing just one output
> file, in which we have the offsets emitted as PG_KEYWORD macros
> and then the giant string emitted as a macro definition, ie
> something like
>
> #define PG_KEYWORD_STRING \
> "absolute\0" \
> "alias\0" \
> ...
>
> That simplifies the Makefile-hacking, at least, and it possibly gives
> callers more flexibility about what they actually want to do with the
> string.
Okay, I tried that. Since the script is turning one header into
another, I borrowed the "*_d.h" nomenclature from the catalogs. Using
a single file required some #ifdef hacks in the output file. Maybe
there's a cleaner way to do this, but I don't know what it is.
Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.
> I'm for "not change things unnecessarily". People might well be
> scraping the keyword list out of parser/kwlist.h for other purposes
> right now --- indeed, it's defined the way it is exactly to let
> people do that. I don't see a good reason to force them to redo
> whatever tooling they have that depends on that. So let's build
> kwlist_offsets.h alongside that, but not change kwlist.h itself.
Done.
>> I used the global .gitignore, but maybe that's an abuse of it.
>
> Yeah, I'd say it is.
Moved.
>> +# TODO: Error out if the keyword names are not in ASCII order.
>
> +many for including such a check.
Done.
> Also note that we don't require people to have Perl installed when
> building from a tarball. Therefore, these derived headers must get
> built during "make distprep" and removed by maintainer-clean but
> not distclean. I think this also has some implications for VPATH
> builds, but as long as you follow the pattern used for other
> derived header files (e.g. fmgroids.h), you should be fine.
Done. I also blindly added support for MSVC.
-John Naylor
Attachment | Content-Type | Size |
---|---|---|
v2-wip-use-offset-based-scankeywords.patch | text/x-patch | 22.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2018-12-21 23:42:57 | Re: Clean up some elog messages and comments for do_pg_stop_backup and do_pg_start_backup |
Previous Message | Michael Paquier | 2018-12-21 23:38:19 | Re: could recovery_target_timeline=latest be the default in standby mode? |