Re: strange IS NULL behaviour

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange IS NULL behaviour
Date: 2013-07-04 16:26:50
Message-ID: 20130704162650.GC17790@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 26, 2012 at 09:29:19PM +0100, Hannu Krosing wrote:
> On 11/26/2012 09:05 PM, Tom Lane wrote:
> >Hannu Krosing <hannu(at)2ndQuadrant(dot)com> writes:
> >>In some previous mail Tom Lane claimed that by SQL standard
> >>either an array of all NULLs or a record with all fields NULLs (I
> >>don't remember which) is also considered NULL. If this is true,
> >>then an empty array - which can be said to consist of nothing
> >>but NULLs - should itself be NULL.
> >What I think you're referring to is that the spec says that "foo IS
> >NULL" should return true if foo is a record containing only null fields.
> Is this requirement recursive ?
>
> That is , should
>
> ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL
> also be true ?
>
> Currently PostgreSQL does this kind of IS NULL for "simple" rows
>
> hannu=# SELECT ROW(NULL, NULL) IS NULL;
> ?column?
> ----------
> t
> (1 row)
>
> and also for first level row types
>
> hannu=# SELECT ROW(NULL, ROW(NULL, NULL)) IS NULL;
> ?column?
> ----------
> t
> (1 row)
>
> but then mysteriously stops working at third level
>
> hannu=# SELECT ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL;
> ?column?
> ----------
> f
> (1 row)

I finally had time to look at this, and it is surprising. I used
EXPLAIN VERBOSE to see what the optimizer was outputting:

EXPLAIN VERBOSE SELECT ROW(null) IS NULL;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
--> Output: true

EXPLAIN VERBOSE SELECT ROW(ROW(null)) IS NULL;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
--> Output: (ROW(NULL::unknown) IS NULL)

EXPLAIN VERBOSE SELECT ROW(ROW(ROW(null))) IS NULL;
QUERY PLAN
---------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
--> Output: (ROW(ROW(NULL::unknown)) IS NULL)

The first test outputs a constant, 'true'. The second test is
ROW(NULL::unknown) because the inner ROW(NULL) was converted to
NULL:unknown. The third one, which returns false (wrong), happens
because you have ROW embedded in ROW, which the optimizer can't process,
and the executor can't either.

I developed the attached patch which properly recurses into ROW()
records checking for NULLs; you can see it returns the right answer in
all cases (and constant folds too):

test=> EXPLAIN VERBOSE SELECT ROW(null) IS NULL;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
--> Output: true
(2 rows)

test=> EXPLAIN VERBOSE SELECT ROW(ROW(null)) IS NULL;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
--> Output: true
(2 rows)

test=> EXPLAIN VERBOSE SELECT ROW(ROW(ROW(null))) IS NULL;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
--> Output: true
(2 rows)

You might think the problem is only with constants, but it extends to
column values too (non-patched server):

CREATE TABLE test (x INT);
CREATE TABLE

INSERT INTO test VALUES (1), (NULL);
INSERT 0 2

SELECT ROW(x) IS NULL FROM test;
?column?
----------
f
t

SELECT ROW(ROW(x)) IS NULL FROM test;
?column?
----------
f
t

SELECT ROW(ROW(ROW(x))) IS NULL FROM test;
?column?
----------
--> f
--> f

With the patch, that works too:

SELECT ROW(ROW(ROW(x))) IS NULL FROM test;
?column?
----------
f
t

The optimizer seems like the right place to fix this, per my patch. It
already flattens IS NULL tests into a series of AND clauses, and now by
recursing, it handles nested ROW values properly too.

This fix would be for head only.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
isnull.diff text/x-diff 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-07-04 17:48:35 Re: Review: Patch to compute Max LSN of Data Pages
Previous Message Antonin Houska 2013-07-04 16:11:47 Re: LATERAL quals revisited