pgsql: Avoid unnecessary post-sort projection

From: Richard Guo <rguo(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Avoid unnecessary post-sort projection
Date: 2024-09-04 03:20:48
Message-ID: E1slgZb-0003Ur-D6@gemulon.postgresql.org
Lists: pgsql-committers

Avoid unnecessary post-sort projection

When generating paths for the ORDER BY clause, one thing we need to
ensure is that the output paths project the correct final_target. To
achieve this, in create_ordered_paths, we compare the pathtarget of
each generated path with the given 'target', and add a post-sort
projection step if the two targets do not match.

Currently we perform a simple pointer comparison between the two
targets. It turns out that this is not sufficient. Each sorted_path
generated in create_ordered_paths initially projects the target
required by the steps that precede the sort. If that target is the
same pointer as sort_input_target, pointer comparison suffices, because
sort_input_target is always identical to final_target when no
post-sort projection is needed.

However, sorted_path's initial pathtarget may not be the same pointer
as sort_input_target, because in apply_scanjoin_target_to_paths, if
the target to be applied has the same expressions as the existing
reltarget, we only inject the sortgroupref info into the existing
pathtargets, rather than create new projection paths. As a result,
pointer comparison in create_ordered_paths is not reliable.

Instead, we can compare PathTarget.exprs to determine whether a
projection step is needed. If the expressions match, we can be
confident that a post-sort projection is not required.
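
As a rough illustration of the difference between the two checks, here
is a minimal, self-contained C sketch. It does not use the real planner
structures: ToyTarget and both needs_projection_* functions are
hypothetical names invented for this example, and expressions are
modeled as plain strings, whereas the planner compares the node lists
stored in PathTarget.exprs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy stand-in for a path target.  This is an assumption made for
 * illustration only; it is NOT PostgreSQL's PathTarget struct.
 */
typedef struct ToyTarget
{
    const char    **exprs;          /* expression texts, in output order */
    size_t          nexprs;         /* number of entries in exprs */
    const unsigned *sortgrouprefs;  /* per-expression labels; neither check reads them */
} ToyTarget;

/*
 * Pointer-style check: a projection is requested unless both arguments
 * are the very same object.
 */
bool
needs_projection_by_pointer(const ToyTarget *path_target,
                            const ToyTarget *final_target)
{
    return path_target != final_target;
}

/*
 * Expression-style check: a projection is requested only if the two
 * expression lists differ in length or in any element.
 */
bool
needs_projection_by_exprs(const ToyTarget *path_target,
                          const ToyTarget *final_target)
{
    if (path_target->nexprs != final_target->nexprs)
        return true;

    for (size_t i = 0; i < path_target->nexprs; i++)
    {
        if (strcmp(path_target->exprs[i], final_target->exprs[i]) != 0)
            return true;
    }

    return false;
}
```

Given two distinct ToyTarget objects that both list the expressions
"a" and "b", the pointer-style check asks for a projection while the
expression-style check does not. That is the situation described
above, where an existing pathtarget was relabeled in place instead of
being replaced by a new projection path.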

It could be argued that this change adds the cost of an extra
comparison each time we decide whether a post-sort projection is
needed. However, as explained in apply_scanjoin_target_to_paths, by
avoiding the creation of projection paths, we save effort both
immediately and at plan creation time. This, I think, justifies the
extra cost.

There are two ensuing plan changes in the regression tests, but they
look reasonable and are exactly what we are fixing here. So no
additional test cases are added.

No backpatch as this could result in plan changes.

Author: Richard Guo
Reviewed-by: Peter Eisentraut, David Rowley, Tom Lane
Discussion: https://postgr.es/m/CAMbWs48TosSvmnz88663_2yg3hfeOFss-J2PtnENDH6J_rLnRQ@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/9626068f13338f79ba183b4cf3c975e22c98c575

Modified Files
--------------
src/backend/optimizer/plan/planner.c             | 14 +++++++++----
src/test/regress/expected/select_distinct_on.out | 26 +++++++++++-------------
2 files changed, 22 insertions(+), 18 deletions(-)
