Re: Bugreport: incorrect result in 16.1 in query with string_agg(distinct+json_build_object

From: Oscar van Baten <info(at)oxcro(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Bugreport: incorrect result in 16.1 in query with string_agg(distinct+json_build_object
Date: 2024-01-03 18:19:36
Message-ID: CAC+9yuPN+hJUUGruaCtpgh+8Qvc1hcg8ZE9VMgE32tT+DJMhWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I've found that v16.1 has 2 new settings:
enable_presorted_aggregate
recursive_worktable_factor

When I SET enable_presorted_aggregate = OFF: (non default)

The result changes to the correct one.
QUERY PLAN
GroupAggregate (cost=94.38..128.98 rows=200 width=96)
Output: a.id, string_agg(DISTINCT 'val_B'::text, ', '::text),
(json_agg(json_build_object('val_1', a.id, 'val_2', (SubPlan 1))))::text
Group Key: a.id
-> Sort (cost=94.38..97.78 rows=1360 width=32)
Output: a.id
Sort Key: a.id
-> Seq Scan on pg_temp.alldata a (cost=0.00..23.60 rows=1360
width=32)
Output: a.id
SubPlan 1
-> Result (cost=0.00..0.01 rows=1 width=32)
Output: a.id
Settings: effective_cache_size = '16179496kB', enable_presorted_aggregate =
'off', jit = 'off', temp_buffers = '64MB', search_path = 'public, public,
"$user"'
Query Identifier: 1935934247983289476

And SET jit = ON; kept the wrong result.

Regards,
Oscar

Op wo 3 jan 2024 om 19:01 schreef Oscar van Baten <info(at)oxcro(dot)com>:

> Hi Tom,
>
> There you go:
>
> v14.7
> QUERY PLAN
> |
>
> ----------------------------------------------------------------------------------------------------------------------------------------+
> GroupAggregate (cost=94.38..128.98 rows=200 width=96)
> |
> Output: a.id, string_agg(DISTINCT 'val_B'::text, ', '::text),
> (json_agg(json_build_object('val_1', a.id, 'val_2', (SubPlan 1))))::text|
> Group Key: a.id
> |
> -> Sort (cost=94.38..97.78 rows=1360 width=32)
> |
> Output: a.id
> |
> Sort Key: a.id
> |
> -> Seq Scan on pg_temp_684.alldata a (cost=0.00..23.60 rows=1360
> width=32) |
> Output: a.id
> |
> SubPlan 1
> |
> -> Result (cost=0.00..0.01 rows=1 width=32)
> |
> Output: a.id
> |
> Settings: effective_cache_size = '8014960kB', jit = 'off', search_path =
> 'public, public, "$user"' |
> Query Identifier: -1271634773772018342
> |
>
>
>
> v16.1
> QUERY PLAN
> |
>
> ----------------------------------------------------------------------------------------------------------------------------------------+
> GroupAggregate (cost=94.38..128.98 rows=200 width=96)
> |
> Output: a.id, string_agg(DISTINCT 'val_B'::text, ', '::text),
> (json_agg(json_build_object('val_1', a.id, 'val_2', (SubPlan 1))))::text|
> Group Key: a.id
> |
> -> Sort (cost=94.38..97.78 rows=1360 width=32)
> |
> Output: a.id
> |
> Sort Key: a.id
> |
> -> Seq Scan on pg_temp.alldata a (cost=0.00..23.60 rows=1360
> width=32) |
> Output: a.id
> |
> SubPlan 1
> |
> -> Result (cost=0.00..0.01 rows=1 width=32)
> |
> Output: a.id
> |
> Settings: effective_cache_size = '16179496kB', jit = 'off', temp_buffers =
> '64MB', search_path = 'public, public, "$user"' |
> Query Identifier: 1935934247983289476
> |
>
> Will try to compare the planner settings too.
>
> Regards,
> Oscar
>
> Op wo 3 jan 2024 om 17:56 schreef Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>
>> Oscar van Baten <info(at)oxcro(dot)com> writes:
>> > We've upgraded one of our instances from 14.10 to 16.1 at AWS
>> > All fine, except for a certain output which became different.
>> > At another instance running at 14.7 we are able to reproduce this
>> correct
>> > answer.
>> > Removing the `DISTINCT` or replacing `(SELECT a.id)` with `a.id` does
>> solve
>> > it.
>> > It looks like a reference issue.
>>
>> > You can reproduce with the query below.
>>
>> Hi, I failed to reproduce this. I get the expected output
>>
>> table_value | string_agg_value | zzz_2
>>
>>
>> -------------+------------------+------------------------------------------
>> val_A | val_B | [{"val_1" : "val_A", "val_2" : "val_A"}]
>> (1 row)
>>
>> both in HEAD and at 16.1 release (3edc6580c). It seems to me that
>> either you have some relevant non-default planner setting, or this
>> is an AWS-specific bug. Could we see the output of
>> EXPLAIN (VERBOSE, SETTINGS)
>> for this query on the troublesome system?
>>
>> We do have an open report of a problem with DISTINCT aggregates,
>>
>> https://www.postgresql.org/message-id/flat/18264-e363593d7e9feb7d%40postgresql.org
>> but it's unclear whether this is the same.
>>
>> regards, tom lane
>>
>>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2024-01-03 18:20:27 Re: Bugreport: incorrect result in 16.1 in query with string_agg(distinct+json_build_object
Previous Message Oscar van Baten 2024-01-03 18:01:20 Re: Bugreport: incorrect result in 16.1 in query with string_agg(distinct+json_build_object