Re: scanner/parser minimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: scanner/parser minimization
Date: 2013-02-28 21:09:11
Message-ID: 12314.1362085751@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> A whole lot of those state transitions are attributable to states
> which have separate transitions for each of many keywords.

Yeah, that's no surprise.

The idea that's been in the back of my mind for a while is to try to
solve the problem at the lexer level not the parser level: that is,
have the lexer return IDENT whenever a keyword appears in a context
where it should be interpreted as unreserved. You suggest somehow
driving that off mid-rule actions, but that seems to me to be probably
a nonstarter from a code complexity and maintainability standpoint.

I believe however that it's possible to extract an idea of which
tokens the parser believes it can see next at any given parse state.
(I've seen code for this somewhere on the net, but am too lazy to go
searching for it again right now.) So we could imagine a rule along
the lines of "if IDENT is allowed as a next token, and $KEYWORD is
not, then return IDENT not the keyword's own token".
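
To make the rule concrete, here is a minimal sketch in Python rather than
in the actual flex/bison code. It assumes the lexer is somehow handed the
set of tokens the parser can accept in its current state; how to obtain
that set is exactly the open question. The names IDENT, KEYWORDS and
classify are hypothetical and stand in for whatever the real scanner
would use.

```python
# Hypothetical token name for a plain identifier.
IDENT = "IDENT"

# Hypothetical keyword table: lowercase spelling -> the keyword's own token.
KEYWORDS = {"select": "SELECT", "from": "FROM", "window": "WINDOW"}


def classify(word, allowed_next):
    """Pick the token to return for `word`.

    `allowed_next` is assumed to be the set of token names the parser
    could accept in its current state.
    """
    kw_token = KEYWORDS.get(word.lower())
    if kw_token is None:
        # Not a keyword at all, so it can only be an identifier.
        return IDENT
    if IDENT in allowed_next and kw_token not in allowed_next:
        # The rule from the text: IDENT is acceptable here and the
        # keyword's own token is not, so demote the keyword to IDENT.
        return IDENT
    # Otherwise return the keyword's own token, as the lexer does today.
    return kw_token
```

Note that when both IDENT and the keyword token are acceptable, this
sketch still returns the keyword token, so such a keyword would still
behave as reserved in that context.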

This might be unworkable from a speed standpoint, depending on how
expensive it is to make the determination about allowable next symbols.
But it seems worth looking into.

regards, tom lane
