Re: Document NULL

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(at)eisentraut(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Document NULL
Date: 2024-11-22 19:00:28
Message-ID: 505f9dfcc6e84a442c6fbdf600ab1df1e5030603.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2024-06-28 at 13:39 -0700, David G. Johnston wrote:
> The attached are complete and ready for review.  I did some file
> structure reformatting at the end and left that as the second patch. 
> The first contains all of the content.

I read through v4-0001. Thank you for working on this!

I really like the overall feel of the document: outlines the various
interpretations, behaviors, nuances and rationales; practical and not
philosophical.

Comments:

1.

One idea is to have a brief guidance section to help users know how to
use nulls in their data model effectively. For instance, if you allow
nulls for middle_name to mean "no middle name", then you have to be
careful when concatenating it as part of a larger string (otherwise it
will make the entire result null). Using COALESCE() can be a good
strategy here.

2.

It would be helpful to go through a combined example that shows how
these varous behaviors interact. For instance:

SELECT r.a, SUM(s.b)
FROM r LEFT JOIN s ON r.id = s.id
GROUP BY r.a HAVING SUM(s.b) < 123;

Assume that there are no null values in r or s, and there's one record
in r with no match in s. First, a null value comes into existence from
the outer join when there's no match. Then, the GROUP BY creates a
group with a single null value. Then the SUM aggregates it and returns
null. Then the less-than expression evaluates to null (due to 3VL),
then the HAVING clause excludes the record because it's distinct from
true. That's probably not what the user intended -- the sum of no
records is intuitively less than 123.

3. "...more formally, the Law of the Excluded Middle does not hold:
i.e., p OR NOT(p) != true; for all p."

Switching to formal language here is confusing (and wrong, I think). I
suggest rewording and I don't think you need formal language here: what
you are highlighting is that, when p is the null value, the expression
"p OR NOT p" evaluates to null, which is surprising to someone who is
used to thinking in 2VL.

4. COUNT() with no input is a special case that returns zero, and I
think that's worth mentioning somewhere.

Regards,
Jeff Davis

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-11-22 19:21:53 Re: Allow non-superuser to cancel superuser tasks.
Previous Message Nathan Bossart 2024-11-22 18:43:45 Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.