Re: AW: Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Stand ards

From: Tom Ivar Helbekkmo <tih(at)kpnQwest(dot)no>
To: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
Cc: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: AW: Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Stand ards
Date: 2001-06-07 10:30:20
Message-ID: 86ae3kr39f.fsf@athene.i.eunet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at> writes:

> Actually I am not sure whether the column = NULL syntax is even defined
> or allowed in SQL92 (e.g. Informix interprets the NULL as column name in
> this context and errs out).

I don't have the standard handy, but I do have Joe Celko's book, "Data
& Databases: Concepts in Practice". He says (in section 8.2, under
the heading "Multivalued Logic"):

A NULL cannot be compared to another NULL or to a value
with what Dr. Codd called a theta operator and what
programmers call a comparison operator (equal, not equal,
less than, greater than, and so forth). This results in
a three-valued logic, which has an UNKNOWN in addition
to TRUE and FALSE. [...] UNKNOWN is a logical value and
not the same as a NULL, which is a data value. That is
why you have to say X IS [NOT] NULL in SQL and not use
X = NULL instead. Theta operators are expressions of the
form X <comp op> Y; when X or Y or both are NULL, theta
operators will return an UNKNOWN and not a NULL.

He goes on to explain three-valued logic in more detail, showing truth
tables according to Jan Lukasiewicz (the inventor of RPN), and says,
of SQL-92, that it "is comforting to see that [it has] the same truth
tables as the three-valued system of Lukasiewicz". Further, he says:

SQL-92 added a new predicate of the form

<search condition> IS [NOT] TRUE | FALSE | UNKNOWN

which will let you map any combination of three-valued
logic to the two Boolean values.

A quick test run with psql shows that PostgreSQL does not properly
implement three-valued logic: it does not recognize the UNKNOWN
keyword alongside TRUE and FALSE, in any situation. It will also
return boolean truth values for comparisons with NULL values, using
them as "real" data values in the comparison. Worse (IMHO), this is
not consistent: while a test for "column = NULL" will return rows
where that is true, and a test for "not column = NULL" will return the
rest, "column <> NULL" returns no rows! This means that the theta
operators are not all treated the same way, which is surely wrong!

It seems to me that the idea of NULL as an unkown data value and
UNKNOWN as the corresponding truth value, combined with the rules for
propagation of NULL in mathematical operations, of UNKNOWN in truth
operations, and from NULL to UNKNOWN by theta operators, is a very
clean, intuitive way of handling these issues. It feels right! :-)

-tih
--
The basic difference is this: hackers build things, crackers break them.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Ivar Helbekkmo 2001-06-07 11:08:17 Re: 7.2 items
Previous Message Tom Lane 2001-06-07 10:00:30 Re: behavior of ' = NULL' vs. MySQL vs. Standards