RE: GROUP BY using tablename.* does not work if tablename has 1 column with NULL values

From: "Narayanan Iyer" <nars(at)yottadb(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)lists(dot)postgresql(dot)org>, <support(at)yottadb(dot)com>
Subject: RE: GROUP BY using tablename.* does not work if tablename has 1 column with NULL values
Date: 2021-10-08 17:34:37
Message-ID: 0f6701d7bc6a$c52e7370$4f8b5a50$@yottadb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tom, thank you for the detailed response. Did not realize t1.* creates a ROW(NULL) value whereas t1.lastName does not. That explains the difference.

I also did not know the ::record syntax. Nice tool. Thanks.

I agree it is not a bug. Please close the issue.

Narayanan.

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Friday, October 8, 2021 12:35 PM
To: Narayanan Iyer <nars(at)yottadb(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org; support(at)yottadb(dot)com
Subject: Re: GROUP BY using tablename.* does not work if tablename has 1 column with NULL values

"Narayanan Iyer" <nars(at)yottadb(dot)com> writes:
> In the below example, t1 points to a table with just 1 column (lastName) and so I expect the 2 SELECT queries (pasted below) using t1.lastName or t1.* syntax to produce the exact same results. But the latter produces one extra row of output (3 rows vs 2 rows).

I don't see any bug here. If you take out the grouping it's a bit easier
to see what's going on:

SELECT *, t1.*::record FROM tmp t2 LEFT JOIN (SELECT lastName FROM tmp) AS t1 ON (t2.firstName <= 'Acid');
id | firstname | lastname | lastname | t1
----+-----------+----------+----------+--------
1 | Acid | Burn | Burn | (Burn)
1 | Acid | Burn | | ()
2 | Joey | | |
(3 rows)

The first row comes from joining t2 Acid,Burn to t1 Acid,Burn.
The second row comes from joining t2 Acid,Burn to t1 Joey,NULL.
The third row comes from failing to join t2 Joey,NULL to anything.

So in the second row, there is a matched t1 row, and t1.*
therefore represents a non-null composite value that happens
to contain one null field. In the third row, there is no
matched t1 row at all, so we consider that t1.* is a composite NULL,
which is different from a composite containing NULL fields.

You can argue about whether composite NULL ought to be considered
identical to ROW(NULL). But there are a lot of contexts where
that would be a bad idea, and very few where it'd be good.

> It seems like a Postgres bug to me. Not sure if there is a Postgres setting that I need to enable in order to avoid this discrepancy.

You might be able to get the results you want by explicitly
converting to row-containing-nulls, which is not very hard:

SELECT *, row(t1.*) FROM tmp t2 LEFT JOIN (SELECT lastName FROM tmp) AS t1 ON (t2.firstName <= 'Acid');
id | firstname | lastname | lastname | row
----+-----------+----------+----------+--------
1 | Acid | Burn | Burn | (Burn)
1 | Acid | Burn | | ()
2 | Joey | | | ()
(3 rows)

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Narayanan Iyer 2021-10-08 17:37:19 RE: GROUP BY using tablename.* does not work if tablename has 1 column with NULL values
Previous Message David G. Johnston 2021-10-08 16:50:54 Re: GROUP BY using tablename.* does not work if tablename has 1 column with NULL values