From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | David Fetter <david(at)fetter(dot)org> |
Cc: | Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division] |
Date: | 2013-06-24 02:50:13 |
Message-ID: | 15895.1372042213@sss.pgh.pa.us |
Lists: | pgsql-hackers |
David Fetter <david(at)fetter(dot)org> writes:
> On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote:
>> I think it is OK if that gets a syntax error. If that's the "worst
>> case" I like this approach.
> I think reducing the usefulness of error messages is something we need
> to think extremely hard about before we do it. Is there maybe a way to
> keep the error messages even if by some magic we manage to unreserve
> the words?
Of the alternatives discussed so far, I don't really like anything
better than adding another special case to base_yylex(). Robert opined
in the other thread about RESPECT NULLS that the potential failure cases
with that approach are harder to reason about, which is true, but that
doesn't mean that we should accept failures we know of because there
might be failures we don't know of.
One thing that struck me while thinking about this is that it seems
like we've been going about it wrong in base_yylex() in any case.
For example, because we fold WITH followed by TIME into a special token
WITH_TIME, we will fail on a statement beginning

    WITH time AS ...

even though "time" is a legal ColId. But suppose that instead of
merging the two tokens into one, we just changed the first token into
something special; that is, base_yylex would return a special token
WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above
problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading
keyword of a statement; and similarly for the few other places where
WITH can be followed by an arbitrary identifier.
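
To make the relabel-instead-of-merge idea concrete, here is a minimal sketch in Python. It is not PostgreSQL's actual base_yylex() code: tokens are modelled as plain strings, and the names relabel_with, starts_cte, COLID_TOKENS and the "IDENT" token are hypothetical, introduced only for illustration. Only WITH_FOLLOWED_BY_TIME comes from the proposal above.

```python
from typing import List

# Assumption: in this toy model a ColId is either a plain identifier or TIME.
COLID_TOKENS = {"IDENT", "TIME"}


def relabel_with(tokens: List[str]) -> List[str]:
    """One-token lookahead filter.

    WITH is relabelled as WITH_FOLLOWED_BY_TIME when the next token is TIME.
    The TIME token itself stays in the stream, unlike the merged WITH_TIME
    approach, so the grammar can still see it as a ColId.
    """
    out: List[str] = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == "WITH" and nxt == "TIME":
            out.append("WITH_FOLLOWED_BY_TIME")
        else:
            out.append(tok)
    return out


def starts_cte(tokens: List[str]) -> bool:
    """Toy stand-in for a grammar rule: either WITH variant may lead a CTE."""
    t = relabel_with(tokens)
    return (
        len(t) >= 3
        and t[0] in ("WITH", "WITH_FOLLOWED_BY_TIME")
        and t[1] in COLID_TOKENS
        and t[2] == "AS"
    )
```

With this filter, the token sequence for "WITH time AS ..." is still accepted as the start of a statement, while "timestamp WITH TIME ZONE" reaches the grammar with the relabelled WITH followed by an ordinary TIME token.
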
Going on the same principle, we could probably let FILTER be an
unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a
type_func_name_keyword. (I've not tried this though.)
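
The same lookahead trick applied to FILTER might look like the sketch below. Again this is an untested illustration in Python rather than lexer code from the tree: relabel_filter and the string token model are hypothetical, and only the FILTER_FOLLOWED_BY_PAREN name comes from the suggestion above.

```python
from typing import List


def relabel_filter(tokens: List[str]) -> List[str]:
    """Relabel FILTER only when the very next token is an open paren.

    A bare FILTER (for example a column named "filter") passes through
    unchanged, so it could remain an unreserved keyword in the grammar.
    """
    out: List[str] = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == "FILTER" and nxt == "(":
            out.append("FILTER_FOLLOWED_BY_PAREN")
        else:
            out.append(tok)
    return out
```

Note that a call to a function named filter(...) would also be relabelled in this sketch, so the grammar category chosen for the relabelled token would have to keep allowing function names.
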
This idea doesn't help much for OVER because one of the alternatives for
over_clause is "OVER ColId", and I doubt we want to have base_yylex know
all the alternatives for ColId. I also had no great success with the
NULLS FIRST/LAST case: AFAICT the substitute token for NULLS still has
to be fully reserved, meaning that something like "select nulls last"
still doesn't work without quoting. We could maybe fix that with enough
denormalization of the index_elem productions, but it'd be ugly.
> Could well. I suspect we may need to rethink the whole way we do
> grammar at some point, but that's for a later discussion when I (or
> someone else) has something choate to say about it.
It'd sure be interesting to know what the SQL committee's target parsing
algorithm is. I find it hard to believe they're uninformed enough to
not know that these random syntaxes they keep inventing are hard to deal
with in LALR(1). Or maybe they really don't give a damn about breaking
applications every time they invent a new reserved word?
regards, tom lane