| From: | John Naylor <jcnaylor(at)gmail(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) | 
| Date: | 2018-12-21 23:38:48 | 
| Message-ID: | CAJVSVGXsogxLCphCW1=PQc0m5PFjnpx9Wc98UfmSXZM0inpv2g@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 12/20/18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'd be inclined to put the script in src/tools, I think.  IMO src/common
> is for code that actually gets built into our executables.
Done.
>> which takes
>> pl_unreserved_kwlist.h as input and outputs
>> pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h.
>
> I wonder whether we'd not be better off producing just one output
> file, in which we have the offsets emitted as PG_KEYWORD macros
> and then the giant string emitted as a macro definition, ie
> something like
>
> #define PG_KEYWORD_STRING \
> 	"absolute\0" \
> 	"alias\0" \
> 	...
>
> That simplifies the Makefile-hacking, at least, and it possibly gives
> callers more flexibility about what they actually want to do with the
> string.
Okay, I tried that. Since the script is turning one header into
another, I borrowed the "*_d.h" nomenclature from the catalogs. Using
a single file required some #ifdef hacks in the output file. Maybe
there's a cleaner way to do this, but I don't know what it is.
Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.
> I'm for "not change things unnecessarily".  People might well be
> scraping the keyword list out of parser/kwlist.h for other purposes
> right now --- indeed, it's defined the way it is exactly to let
> people do that.  I don't see a good reason to force them to redo
> whatever tooling they have that depends on that.  So let's build
> kwlist_offsets.h alongside that, but not change kwlist.h itself.
Done.
>> I used the global .gitignore, but maybe that's an abuse of it.
>
> Yeah, I'd say it is.
Moved.
>> +# TODO: Error out if the keyword names are not in ASCII order.
>
> +many for including such a check.
Done.
> Also note that we don't require people to have Perl installed when
> building from a tarball.  Therefore, these derived headers must get
> built during "make distprep" and removed by maintainer-clean but
> not distclean.  I think this also has some implications for VPATH
> builds, but as long as you follow the pattern used for other
> derived header files (e.g. fmgroids.h), you should be fine.
Done. I also blindly added support for MSVC.
-John Naylor
| Attachment | Content-Type | Size | 
|---|---|---|
| v2-wip-use-offset-based-scankeywords.patch | text/x-patch | 22.0 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2018-12-21 23:42:57 | Re: Clean up some elog messages and comments for do_pg_stop_backup and do_pg_start_backup | 
| Previous Message | Michael Paquier | 2018-12-21 23:38:19 | Re: could recovery_target_timeline=latest be the default in standby mode? |