Re: Error when using array_agg with filter where clause in pg16 and pg17

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kaimeh <kkaimeh(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Error when using array_agg with filter where clause in pg16 and pg17
Date: 2025-04-08 15:32:28
Message-ID: 4040920.1744126348@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Kaimeh <kkaimeh(at)gmail(dot)com> writes:
> In postgresql 16 and 17 using array_agg with filter where gives an error,
> while in postgres 15 exact same query works.

> This is minimal sample for reproducing:

> create table test (id int, data jsonb);
> insert into test (id, data) values
> (1, '{"a": null}'),
> (2, '{"a": "2"}'),
> (3, '{"a": "2"}'),
> (4, '{"a": ""}');
> select array_agg(distinct (data->>'a')::int) filter (where data->>'a' is
> not null and data->>'a' != '') from test;

Ugh. EXPLAIN tells the tale:

Aggregate (cost=113.57..113.58 rows=1 width=32)
Output: array_agg(DISTINCT (((data ->> 'a'::text))::integer)) FILTER (WHERE (((data ->> 'a'::text) IS NOT NULL) AND ((data ->> 'a'::text) <> ''::text)))
-> Sort (cost=88.17..91.35 rows=1270 width=32)
Output: data, (((data ->> 'a'::text))::integer)
Sort Key: (((test.data ->> 'a'::text))::integer)
-> Seq Scan on public.test (cost=0.00..22.70 rows=1270 width=32)
Output: data, ((data ->> 'a'::text))::integer

We have pushed the array_agg argument down in order to sort by it,
neglecting the fact that there's a filter clause that should prevent
evaluation failures.

Bisecting fingers this commit:

1349d2790bf48a4de072931c722f39337e72055e is the first bad commit
commit 1349d2790bf48a4de072931c722f39337e72055e
Author: David Rowley <drowley(at)postgresql(dot)org>
Date: Tue Aug 2 23:11:45 2022 +1200

Improve performance of ORDER BY / DISTINCT aggregates

Fortunately, that commit didn't actually rip out the old code path.
The simplest fix I can think of is to disable the presorted-agg
optimization if (1) there's a FILTER clause and (2) the proposed
sort key is anything more complex than a Var. There might be
some wiggle room in (2) -- for instance, RelabelType(Var) should
be safe -- but we don't have a lot of intelligence about which
expression types are guaranteed error-free.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2025-04-08 23:55:55 Re: Error when using array_agg with filter where clause in pg16 and pg17
Previous Message David G. Johnston 2025-04-08 14:44:56 Re: BUG #18884: The CURRENT_TIMESTAMP value being returned in UTC-2.