From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: scanner/parser minimization |
Date: | 2013-03-02 18:47:48 |
Message-ID: | 51324954.9060004@vmware.com |
Lists: | pgsql-hackers |
On 02.03.2013 17:09, Tom Lane wrote:
> Greg Stark<stark(at)mit(dot)edu> writes:
>> Regarding yytransition I think the problem is we're using flex to
>> implement keyword recognition which is usually not what it's used for.
>> Usually people use flex to handle syntax things like quoting and
>> numeric formats. All identifiers are handled by flex as equivalent.
>> Then the last step in the scanner for identifiers is to look up the
>> identifier in a hash table and return the keyword token if it's a
>> keyword. That would massively simplify the scanner tables.
>
> Uh ... no. I haven't looked into why the flex tables are so large,
> but this theory is just wrong. See ScanKeywordLookup().
Interestingly, the yy_transition array generated by flex used to be much
smaller:
8.3: 22072 elements
8.4: 62623 elements
master: 64535 elements
The big jump between 8.3 and 8.4 was caused by the introduction of
Unicode escapes: U&'foo' [UESCAPE 'x']. In particular, it comes from the
"error rule" for UESCAPE, which we use to avoid backtracking.
I experimented with a patch that uses two extra flex states to shorten
the error rules, see attached. The idea is that after lexing a Unicode
literal like "U&'foo'", you enter a new state, in which you check
whether a "UESCAPE 'x'" follows. This cuts the size of the array to
36581 elements.
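
To make the idea concrete, here is a rough hand-written sketch in C of the same two-step structure. It is not the attached patch and not flex code: the names `UString`, `scan_ustring_body`, `scan_uescape_suffix` and `scan_ustring` are made up for illustration, and each function merely stands in for one scanner state. The first step only recognizes U&'...'; the second is a short follow-up check for an optional UESCAPE 'x', which falls back to the default backslash and consumes nothing extra if the clause is absent. Validation of the escape character is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not taken from the PostgreSQL sources. */
typedef struct {
    size_t body_start;  /* offset of the first char inside the quotes */
    size_t body_len;    /* raw body length, doubled quotes still doubled */
    char   escape;      /* '\\' unless a UESCAPE clause overrides it */
    size_t consumed;    /* total number of input bytes consumed */
} UString;

static size_t skip_space(const char *s, size_t i)
{
    while (s[i] != '\0' && isspace((unsigned char) s[i]))
        i++;
    return i;
}

/* Case-insensitive match of upper-case kw at s[i]; returns its length or 0. */
static size_t match_keyword(const char *s, size_t i, const char *kw)
{
    size_t n = strlen(kw);
    for (size_t k = 0; k < n; k++)
        if (toupper((unsigned char) s[i + k]) != kw[k])
            return 0;   /* also stops at the terminating NUL */
    return n;
}

/* Step 1 (the "initial" state): recognize U&'...' and nothing else. */
static int scan_ustring_body(const char *s, UString *out)
{
    size_t i = 3;

    if ((s[0] != 'U' && s[0] != 'u') || s[1] != '&' || s[2] != '\'')
        return 0;
    out->body_start = i;
    for (;;) {
        if (s[i] == '\0')
            return 0;                   /* unterminated literal */
        if (s[i] == '\'') {
            if (s[i + 1] != '\'')
                break;                  /* closing quote */
            i++;                        /* doubled quote stays in the body */
        }
        i++;
    }
    out->body_len = i - out->body_start;
    out->escape = '\\';
    out->consumed = i + 1;              /* step over the closing quote */
    return 1;
}

/* Step 2 (the "after literal" state): optional UESCAPE 'x'. */
static void scan_uescape_suffix(const char *s, UString *out)
{
    size_t j = skip_space(s, out->consumed);
    size_t n = match_keyword(s, j, "UESCAPE");

    if (n == 0)
        return;                         /* no clause: consume nothing more */
    j = skip_space(s, j + n);
    if (s[j] == '\'' && s[j + 1] != '\0' && s[j + 1] != '\'' && s[j + 2] == '\'') {
        out->escape = s[j + 1];
        out->consumed = j + 3;
    }
}

/* Returns 1 and fills *out if s starts with a U&'...' literal, else 0. */
static int scan_ustring(const char *s, UString *out)
{
    assert(s != NULL && out != NULL);
    if (!scan_ustring_body(s, out))
        return 0;
    scan_uescape_suffix(s, out);
    return 1;
}
```

The point of the split is that neither step needs a pattern covering every prefix of the combined "U&'foo' UESCAPE 'x'" form, which is roughly what the long error rule has to do in a single-state scanner.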
- Heikki
Attachment | Content-Type | Size |
---|---|---|
smaller-uescape-error-rules.patch | text/x-diff | 3.7 KB |