Re: [HACKERS] Postgres' lexer

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ansley, Michael" <Michael(dot)Ansley(at)intec(dot)co(dot)za>
Cc: "'Leon'" <leon(at)udmnet(dot)ru>, hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Postgres' lexer
Date: 1999-09-03 14:59:06
Message-ID: 12612.936370746@sss.pgh.pa.us
Lists: pgsql-hackers

"Ansley, Michael" <Michael(dot)Ansley(at)intec(dot)co(dot)za> writes:
> I have a bit of a problem with reading this: a > -2 correctly,
> while not reading this: a>-2 correctly, because that implies that you are
> using the space as a precedence operator. This should be done by braces.

Not at all: this is a strictly lexical issue (where do we divide the
input into tokens) and whitespace has been considered a reasonable
lexical separator for years. Furthermore, SQL already depends on
whitespace to separate tokens that are made of letters and digits.
You can't spell "SELECT" as "SEL ECT", nor "SELECT f1" as "SELECTf1",
nor does "SELECT 1 2;" mean "SELECT 12;". So it seems perfectly
reasonable to me to use whitespace to separate operator names when
there would otherwise be ambiguity about what's meant.
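
To make the point concrete, here is a minimal sketch of a scanner that takes the longest run of operator characters as one token and lets whitespace end a token. It is an illustration only, not the actual PostgreSQL scanner; the OP_CHARS set, the token classes and the function name tokenize are assumptions made for the example.

```python
import re

# Assumed set of characters allowed in operator names (illustrative only).
OP_CHARS = "+-*/<>=~!@#%^&|?"

# Skip leading whitespace, then match exactly one token of any class.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<number>\d+)"
    r"|(?P<op>[" + re.escape(OP_CHARS) + r"]+)"
    r"|(?P<punct>[();,]))"
)


def tokenize(text):
    """Split text into tokens; whitespace separates, adjacent operator chars merge."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"cannot tokenize at position {pos}: {text[pos:]!r}")
        tokens.append(m.group(m.lastgroup))
        pos = m.end()
    return tokens


print(tokenize("a > -2"))    # ['a', '>', '-', '2']
print(tokenize("a>-2"))      # ['a', '>-', '2']
print(tokenize("a > (-2)"))  # ['a', '>', '(', '-', '2', ')']
```

Under these assumptions the space plays the same role between ">" and "-" that it plays between "1" and "2" in "SELECT 1 2;": it only decides where one token ends and the next begins, and says nothing about precedence.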

> This: a > (-2) is totally unambiguous, spaces or no spaces.

True, and there's nothing to stop you from writing that style if you
prefer it.

> Perhaps there is a general case for where unary operators are allowed to
> appear, and we can use this, e.g.: they can only appear at the beginning of
> an expression, or immediately after another operator (ignoring spaces).

Don't forget about right-unary operators...
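
One reading of the difficulty, sketched under assumptions: suppose "!" is a right-unary (postfix) operator such as factorial, and apply the proposed rule literally. The function classify_minus and the operator set below are hypothetical names for this example only.

```python
def classify_minus(tokens, operators):
    """Label each '-' using the proposed rule: it is a prefix operator if it
    starts the expression or immediately follows another operator."""
    labels = []
    for i, tok in enumerate(tokens):
        if tok == "-":
            prev = tokens[i - 1] if i > 0 else None
            is_prefix = prev is None or prev in operators
            labels.append("prefix" if is_prefix else "infix")
    return labels


OPERATORS = {"!", "-", ">"}  # assumed operator names

print(classify_minus(["a", ">", "-", "2"], OPERATORS))  # ['prefix']
print(classify_minus(["a", "!", "-", "b"], OPERATORS))  # ['prefix']
```

The second result is the trouble: if "!" is postfix, then "a ! - b" should mean (a !) - b, with "-" as an ordinary binary minus, yet the rule labels it prefix simply because it follows an operator.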

> This means that >- will be scanned as an operator if it exists, or a >
> followed by a unary minus if >- doesn't exist as an operator.

I think it would be a really bad idea for the lexical analysis to depend
on whether or not particular operator names are defined, for the same
reasons that lexical analysis of word tokens doesn't depend on whether
there are keywords/table names/field names that match those tokens.
You get into circularity problems very quickly if you do that.
Language designers learned not to do that in the sixties...
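
A small sketch of one symptom, assuming a hypothetical scanner that splits a run of operator characters according to whichever operator names are currently defined. The function split_operator_run and the catalog set are inventions for this example, not anything in the Postgres source.

```python
def split_operator_run(run, defined_ops):
    """Split a run of operator characters, taking the longest currently
    defined operator name at each position (the scheme being argued against)."""
    pieces = []
    pos = 0
    while pos < len(run):
        for end in range(len(run), pos, -1):
            if run[pos:end] in defined_ops:
                pieces.append(run[pos:end])
                pos = end
                break
        else:
            raise ValueError(f"no defined operator matches {run[pos:]!r}")
    return pieces


catalog = {">", "-"}                         # hypothetical set of defined operators
before = split_operator_run(">-", catalog)   # '>' followed by '-'
catalog.add(">-")                            # someone later defines an operator named >-
after = split_operator_run(">-", catalog)    # now a single '>-' token

print(before, after)
```

The text "a>-2" has not changed between the two calls, but its token stream has. Every stored query or rule containing ">-" would quietly change meaning when the operator is created, and the scanner could not run at all without first consulting the catalog.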

regards, tom lane
