Re: v17 Possible Union All Bug

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Alexander Korotkov <akorotkov(at)postgresql(dot)org>
Cc: PostgreSQL Bug List <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: v17 Possible Union All Bug
Date: 2024-01-29 21:31:36
Message-ID: CAKFQuwbp8cFu_w88cD8M9s6YppnBL1KkJOC9ht8E7MLscUTqvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Jan 29, 2024 at 9:19 AM David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Fri, Jan 26, 2024 at 5:36 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
>> On Sat, 27 Jan 2024 at 13:19, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> > Are the results correct if you SET enable_presorted_aggregate=0;?
>>
>>
> Apparently I didn't reply-all...
>
> Yes, the problem goes away when I disabled presorted_aggregate
>
> I'll see if that knowledge can help build a better reproducer.
>
>
I've deferred doing a better reproducer for the moment, I reliably got:
initdb
psql --file ~/unionall-repro.sql
psql -c 'select * from rolegraph.role_graph_broken;'

oid | role_type | rolname | rolsuper |
administration
-------+-----------+--------------------+----------+-----------------------------------------------
16390 | User | u6_green_leader_su | f | mog of
g6a_fixedops_manager_su from superuser+
| | | | mog of
g6c_service_manager_su from superuser +
| | | | mog of
g6d_service_advisor_su from superuser +
| | | | mog of
g6e_service_tech_su from superuser +
| | | | mog of
g6c_service_manager_su from cr_admin
(1 row)

to be produced for the bad bisect result and the correct nested result for
*manager* to produce on the good result.

❯ git bisect bad
0452b461bc405e6d35d8a14c02813c15e28ae516 is the first bad commit
commit 0452b461bc405e6d35d8a14c02813c15e28ae516
Author: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
Date: Sun Jan 21 22:21:36 2024 +0200

Explore alternative orderings of group-by pathkeys during optimization.

When evaluating a query with a multi-column GROUP BY clause, we can
minimize
sort operations or avoid them if we synchronize the order of GROUP BY
clauses
with the ORDER BY sort clause or sort order, which comes from the
underlying
query tree. Grouping does not imply any ordering, so we can compare
the keys in arbitrary order, and a Hash Agg leverages this. But for
Group Agg,
we simply compared keys in the order specified in the query. This commit
explores alternative ordering of the keys, trying to find a cheaper one.

The ordering of group keys may interact with other parts of the query,
some of
which may not be known while planning the grouping. For example, there
may be
an explicit ORDER BY clause or some other ordering-dependent operation
higher up
in the query, and using the same ordering may allow using either
incremental
sort or even eliminating the sort entirely.

The patch always keeps the ordering specified in the query, assuming
the user
might have additional insights.

This introduces a new GUC enable_group_by_reordering so that the
optimization
may be disabled if needed.

Discussion:
https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Author: Andrei Lepikhov, Teodor Sigaev
Reviewed-by: Tomas Vondra, Claudio Freire, Gavin Flower, Dmitry Dolgov
Reviewed-by: Robert Haas, Pavel Borisov, David Rowley, Zhihong Yu
Reviewed-by: Tom Lane, Alexander Korotkov, Richard Guo, Alena Rybakina

src/backend/optimizer/path/equivclass.c | 13 +-
src/backend/optimizer/path/pathkeys.c | 252 +++++++++++++++
src/backend/optimizer/plan/planner.c | 424
++++++++++++--------------
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/nodes/pathnodes.h | 10 +
src/include/optimizer/paths.h | 2 +
src/test/regress/expected/aggregates.out | 202 ++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/aggregates.sql | 75 +++++
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 770 insertions(+), 223 deletions(-)

postgres-head (0452b46) (BISECTING)
❯ git bisect log
# bad: [97287bdfae41b8ea16b27dccb63771fcc196a55a] Move is_valid_ascii() to
ascii.h.
# good: [aa817c7496575b37fde6ea5e0cd65b26f29ea532] Avoid useless
ReplicationOriginExitCleanup locking
git bisect start '97287bdfae' 'aa817c7496'
# bad: [752533d40fd50de0b09d4b956cc32c38f5df2f05] Test EXPLAIN (FORMAT
JSON) ... XMLTABLE
git bisect bad 752533d40fd50de0b09d4b956cc32c38f5df2f05
# good: [7b1dbf0a8d1d4e1e6d01a76dc45a3216e8a16d94] More documentation
updates for incremental backup.
git bisect good 7b1dbf0a8d1d4e1e6d01a76dc45a3216e8a16d94
# good: [c64086b79dbad19e4ee0af8d19e835111aa87bd5] Reorder actions in
ProcArrayApplyRecoveryInfo()
git bisect good c64086b79dbad19e4ee0af8d19e835111aa87bd5
# good: [7ab80ac1caf9f48064190802e1068ef89e2883c4] Generalize the common
code of adding sort before processing of grouping
git bisect good 7ab80ac1caf9f48064190802e1068ef89e2883c4
# bad: [49cd2b93d7dbceefdf9a71cc301d284a2dd234c3] Add test module
injection_points
git bisect bad 49cd2b93d7dbceefdf9a71cc301d284a2dd234c3
# bad: [c03d91d9be378975bcdbfa3e5d40e17288e6f13f] Fix table name collision
in tests in 0452b461bc
git bisect bad c03d91d9be378975bcdbfa3e5d40e17288e6f13f
# bad: [0452b461bc405e6d35d8a14c02813c15e28ae516] Explore alternative
orderings of group-by pathkeys during optimization.
git bisect bad 0452b461bc405e6d35d8a14c02813c15e28ae516
# first bad commit: [0452b461bc405e6d35d8a14c02813c15e28ae516] Explore
alternative orderings of group-by pathkeys during optimization.

David J.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alexander Korotkov 2024-01-29 21:45:24 Re: v17 Possible Union All Bug
Previous Message David Rowley 2024-01-29 21:18:37 Re: BUG #18295: In PostgreSQL a unique index on targeted columns is sufficient to support a foreign key