pgsql: Add enable_presorted_aggregate GUC

From: David Rowley <drowley(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Add enable_presorted_aggregate GUC
Date: 2022-12-20 09:29:23
Message-ID: E1p7Yw6-004XR9-An@gemulon.postgresql.org
Lists: pgsql-committers

Add enable_presorted_aggregate GUC

1349d279 added query planner support to allow more efficient execution of
aggregate functions which have an ORDER BY or a DISTINCT clause. Prior to
that commit, the planner would only request that the lower planner produce
a plan with the order required for the GROUP BY clause and it would be
left up to nodeAgg.c to perform the final sort of records within each
group so that the aggregate transition functions were called in the
correct order. Now that the planner asks the lower planner for a plan that
takes both the GROUP BY and the ORDER BY / DISTINCT aggregates into
account, it may choose a plan which is less efficient than the one it
would have produced before 1349d279.

While developing 1349d279, I had in mind that Incremental Sort would help
us in cases where an index exists only on the GROUP BY column(s).
Incremental Sort would just replace the implicit tuplesorts which are
being performed in nodeAgg.c. However, because the planner has the
flexibility to instead choose a plan which just performs a full sort on
both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is
potential for the planner to make a bad choice. The costing for
Incremental Sort is not perfect as it assumes an even distribution of rows
to sort within each sort group.
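
To illustrate why an even-distribution assumption can mislead, here is a
minimal sketch in Python. It is a toy model only: it counts roughly
n * log2(n) comparisons per sorted group and is not the formula PostgreSQL
uses to cost Incremental Sort. The function names and the row counts are
made up for the illustration.

```python
import math

def sort_comparisons(n):
    """Approximate comparisons to sort n rows: n * log2(n); zero below 2 rows."""
    return n * math.log2(n) if n > 1 else 0.0

def estimated_cost(total_rows, num_groups):
    """Toy estimate that assumes every presorted group has the same size."""
    rows_per_group = total_rows / num_groups
    return num_groups * sort_comparisons(rows_per_group)

def actual_cost(group_sizes):
    """Toy cost computed from the real size of each presorted group."""
    return sum(sort_comparisons(n) for n in group_sizes)

# Hypothetical data: 1,000,000 rows in 1,000 groups, evenly spread or skewed.
even_groups = [1_000] * 1_000
skewed_groups = [999_001] + [1] * 999

total_rows = sum(even_groups)
estimate = estimated_cost(total_rows, len(even_groups))

print(f"estimate (even assumption): {estimate:,.0f}")
print(f"actual, even groups:        {actual_cost(even_groups):,.0f}")
print(f"actual, skewed groups:      {actual_cost(skewed_groups):,.0f}")
```

In this toy model the estimate matches the evenly spread case, but it
understates the work when almost all rows fall into a single group, since
sorting that one group is close to a full sort of the input.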

Here we add an escape hatch in the form of the enable_presorted_aggregate
GUC. This will allow users to get the pre-PG16 behavior in cases where
they have no other means to convince the query planner to produce a plan
which only sorts on the GROUP BY column(s).

Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/3226f47282a05979483475d1e4a11aab8c1bfc39

Modified Files
--------------
doc/src/sgml/config.sgml                      | 23 +++++++++++++++++++++++
src/backend/optimizer/path/costsize.c         |  1 +
src/backend/optimizer/plan/planner.c          |  3 ++-
src/backend/utils/misc/guc_tables.c           | 15 +++++++++++++++
src/backend/utils/misc/postgresql.conf.sample |  1 +
src/include/optimizer/cost.h                  |  1 +
src/test/regress/expected/aggregates.out      | 11 +++++++++++
src/test/regress/expected/sysviews.out        |  3 ++-
src/test/regress/sql/aggregates.sql           |  6 ++++++
9 files changed, 62 insertions(+), 2 deletions(-)