Re: v17 Possible Union All Bug

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Alexander Korotkov <akorotkov(at)postgresql(dot)org>, PostgreSQL Bug List <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Richard Guo <guofenglinux(at)gmail(dot)com>
Subject: Re: v17 Possible Union All Bug
Date: 2024-02-04 04:57:11
Message-ID: e225591f-50bf-4b35-8b87-8107477804d2@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

And finally, I've got the synthetic test:

CREATE TABLE mess_grouping (x integer, y integer, z integer, w integer,
f integer);
INSERT INTO mess_grouping (x,y,z,w,f) (SELECT x%10, x % 2, x%2, 2, x%10
FROM generate_series(1,100) AS x);
ANALYZE mess_grouping;
SET enable_nestloop = 'off';
SET enable_hashjoin = 'off';
SET enable_hashagg = 'off';
SET enable_group_by_reordering = 'on';
SELECT c1.z, c1.w, string_agg(''::text, repeat(''::text, c1.f) ORDER BY
c1.x,c1.y)
FROM mess_grouping c1 JOIN mess_grouping c2 ON (c1.x = c2.f)
GROUP BY c1.w, c1.z;
SET enable_group_by_reordering = 'off';
SELECT c1.z, c1.w, string_agg(''::text, repeat(''::text, c1.f) ORDER BY
c1.x,c1.y)
FROM mess_grouping c1 JOIN mess_grouping c2 ON (c1.x = c2.f)
GROUP BY c1.w, c1.z;
DROP TABLE IF EXISTS mess_grouping CASCADE;

You can see here, that first query execution produces:
z | w | string_agg
---+---+------------
0 | 2 |
1 | 2 |
0 | 2 |
1 | 2 |
0 | 2 |
1 | 2 |
0 | 2 |
1 | 2 |
0 | 2 |
1 | 2 |
(10 rows)

and second execution gives correct result:
z | w | string_agg
---+---+------------
0 | 2 |
1 | 2 |
(2 rows)

The simple fix is in the attachment. But I'm not sure we should fix
GROUP-BY optimization instead of the more general issue.
The source of the problem is root->group_pathkeys, which contains
grouping pathkeys and aggregate pathkeys. For now, their 'sortref'
values could intersect, and we can differ which one references the query
target list and which one the target list of the aggregate.
So, I would like to get advice here: should we make a quick fix here, or
is such a mess in the sortref values not a mess and designed for some
purposes?

--
regards,
Andrei Lepikhov
Postgres Professional

Attachment Content-Type Size
naive_fix.diff text/plain 1.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David G. Johnston 2024-02-04 19:01:15 Re: BUG #18327: Column naming inconsistency for boolean literals in ELSE clauses of CASE expressions.
Previous Message PG Bug reporting form 2024-02-03 22:55:07 BUG #18328: yum update nothing provides libarmadillo.so.12()(64bit) needed by gdal36-libs-3.6.4-6PGDG.rhel9.x86