Precedence of standard comparison operators

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Precedence of standard comparison operators
Date: 2015-02-19 15:48:34
Message-ID: 12603.1424360914@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

My Salesforce colleagues have been bugging me about this topic, and
since I see in a nearby thread that we may be about to break backwards
compatibility on "=>", maybe it's time to do something about this too.
To wit, that the precedence of <= >= and <> is neither sane nor standards
compliant.

Up to now, Postgres has had special precedence rules for = < and >,
but their multi-character brethren just get treated as general Op
tokens. This has assorted surprising consequences; for example you
can do this:

regression=# select * from testjsonb where j->>'space' < j->>'node';

but not this:

regression=# select * from testjsonb where j->>'space' <= j->>'node';
ERROR: operator does not exist: text <= jsonb
LINE 1: select * from testjsonb where j->>'space' <= j->>'node';
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.

Of course the latter happens because ->> and <= have identical
precedence so the construct is parsed as
((j ->> 'space') <= j) ->> 'node'
whereas < has lower precedence than user-defined operators so
the first case parses in the expected fashion.

I claim that this behavior is contrary to spec as well as being
unintuitive. Following the grammar productions in SQL99:

<where clause> ::= WHERE <search condition>

<search condition> ::=
<boolean value expression>

<boolean value expression> ::=
<boolean term>
| <boolean value expression> OR <boolean term>

<boolean term> ::=
<boolean factor>
| <boolean term> AND <boolean factor>

<boolean factor> ::=
[ NOT ] <boolean test>

<boolean test> ::=
<boolean primary> [ IS [ NOT ] <truth value> ]

<truth value> ::=
TRUE
| FALSE
| UNKNOWN

<boolean primary> ::=
<predicate>
| <parenthesized boolean value expression>
| <nonparenthesized value expression primary>

<parenthesized boolean value expression> ::=
<left paren> <boolean value expression> <right paren>

<predicate> ::=
<comparison predicate>
| <between predicate>
| <in predicate>
| <like predicate>
| <null predicate>
| <quantified comparison predicate>
| <exists predicate>
| <unique predicate>
| <match predicate>
| <overlaps predicate>
| <similar predicate>
| <distinct predicate>
| <type predicate>

<comparison predicate> ::=
<row value expression> <comp op> <row value expression>

<comp op> ::=
<equals operator>
| <not equals operator>
| <less than operator>
| <greater than operator>
| <less than or equals operator>
| <greater than or equals operator>

<row value expression> ::=
<row value special case>
| <row value constructor>

<contextually typed row value expression> ::=
<row value special case>
| <contextually typed row value constructor>

<row value special case> ::=
<value specification>
| <value expression>

So both the examples I gave should be understood as <row value special
case> value expressions related by <comp op>s. This line of reasoning
says that any non-boolean operator should bind tighter than the six
standard comparison operators, because it will necessarily be part of a
<value expression> component of a boolean expression.

We have that right for = < > but not for the other three standard-mandated
comparison operators. I think we should change the grammar so that all
six act like < > do now, that is, they should have %nonassoc precedence
just above NOT.

Another thought, looking at this closely, is that we have the precedence
of IS tests (IS NOT NULL etc) wrong as well: they should bind less tightly
than user-defined ops, not more so. I'm less excited about changing that,
but there's certainly room to argue that it's wrong per spec. It's
definitely weird that the IS tests bind more tightly than multicharacter
Ops but less tightly than + - * /.

I've not really experimented with this at all; it would be useful for
example to see how many regression tests break as a gauge for how
troublesome such changes would be. I thought I'd ask whether there's
any chance at all of such a change getting accepted before doing any
serious work on it.

Thoughts?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2015-02-19 16:01:33 Re: Allow "snapshot too old" error, to prevent bloat
Previous Message Petr Jelinek 2015-02-19 15:06:48 Re: proposal: disallow operator "=>" and use it for named parameters