Contradictory behavior of array_agg(distinct) aggregate.

From: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Contradictory behavior of array_agg(distinct) aggregate.
Date: 2024-12-04 06:44:07
Message-ID: 7d3268df-abc2-4eb3-8e2b-59df084b579a@garret.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers!

Is it only me who consider that current behavior of array_agg(distinct)
contradicts to interpretation of nulls in other cases ("null" is
something like "unknown" which means that we can not say weather two
nulls are the same or not). This is why it is allowed to insert multiple
nulls in the unique column.

postgres=# create table t(x integer unique);
CREATE TABLE
postgres=# insert into t values (null),(null);
INSERT 0 2
postgres=# select count(distinct x) from t;
 count
-------
     0
(1 row)

postgres=# select array_agg(distinct x) from t;
 array_agg
-----------
 {NULL}
(1 row)

postgres=# select array_agg(x) from t;
  array_agg
-------------
 {NULL,NULL}
(1 row)

So what is the number of distinct "x" values in this case? I think that
according to SQL model - 0 (as count(distinct) returns).
Why in this case array_agg(distinct x) returns non-empty array?

Yes, unlike most other aggregates, `array_agg` is not ignoring null values.
But is it correct to treat two null values as the same (non-distinct)?
IMHO correct result in this case should be either {} or NULL, either
{NULL,NULL}.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Pyhalov 2024-12-04 06:59:10 Re: Partition-wise join with whole row vars
Previous Message Zhijie Hou (Fujitsu) 2024-12-04 06:42:55 RE: Memory leak in WAL sender with pgoutput (v10~)