From: | Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | rjuju123(at)gmail(dot)com |
Subject: | Early Sort/Group resjunk column elimination. |
Date: | 2021-07-13 14:19:27 |
Message-ID: | 1710575.LarrGp0jcG@aivenronan |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
I would like to know if there is any interest in working to reduce the usage
and propagation of resjunk columns in the planner.
I think this topic is worth investigating, because most of the time when we
request a sorted path without ever needing the sort key afterwards, we still
carry the sort key to the final tuple, where the JunkFilter will finally get rid
of it.
Rationale
========
This would allow several optimizations.
1) Index not needing to output the column
This one was mentioned as far back as 2009 [1] and is still relevant today.
If we query SELECT a FROM t1 ORDER BY b; and we have an index, on b, we
shouldn't output b at all since we don't need it in the upper nodes. This
might not look like a huge problem by itself, but as noted in [1] it becomes
very expensive in the case of a functional index. This is alleviated for
IndexOnlyScan because it is capable of fetching the value from the index
itself, but it is still a problem.
Take this query as an example:
regression=# explain (verbose) select two from tenk2 order by hundred;
QUERY PLAN
-----------------------------------------------------------------------------------------
Index Scan using tenk2_hundred on public.tenk2 (cost=0.29..1574.20
rows=10000 width=8)
Output: two, hundred
(2 rows)
We should be able to transform it into:
regression=# explain (verbose) select two from tenk2 order by hundred;
QUERY PLAN
-----------------------------------------------------------------------------------------
Index Scan using tenk2_hundred on public.tenk2 (cost=0.29..1574.20
rows=10000 width=4)
Output: two
(2 rows)
2) Other nodes
Other nodes might benefit from it, for exemple in FDW. Right now the sort key
is always returned from the underlying FDW, but if the data can be sorted that
could be a net win.
3) Incremental Sort
While working on the patch to allow Sort nodes to use the datumOnly
optimization, a suggestion came up to also use it in the IncrementalSort. This
is not possible today because even if we don't need the previously-sorted
columns anymore, we still need to output them as resjunk columns.
4) Narrower tuples in dynamic shared memory.
DSM bandwidth is quite expensive, so if we can avoid exchanging some
attributes here it could be a net win.
Proposal
=======
I've been trying to test this idea using a very simple approach. If that is of
interest, I can clean up my branch and post a simple patch to discuss
specifics, but I'd like to keep a high level discussion first.
The idea would be to:
- "tag" resjunk TargetEntries according to why they were added. So a column
added as sort group clause would be tagged as such, and be recognisable
- in the planner, instead of using the whole processed target list to build
the finaltarget, we would remove resjunk entries we don't actually need (those
added only as sortgroup clauses as of now, but there may be other kind of
resjunk entries we can safely omit).
- inject those columns only when generating the input targets needed for
sorting, grouping, window functions and the likes.
Using only this already allows optimization number 1), because if no Sort node
needs to be added the pathtarget just cascade to the bottom of the path.
There is one big downside to this: it introduces a mismatch between the
finaltarget and the output of the previous node (for example sort). This adds a
costly Result node everywhere, performing an expensive projection instead of
the much simpler JunkFilter we currently have:
regression=# explain (verbose) select two from tenk2 order by four;
QUERY PLAN
------------------------------------------------------------------------------
Result (cost=1109.39..1234.39 rows=10000 width=4)
Output: two
-> Sort (cost=1109.39..1134.39 rows=10000 width=8)
Output: two, four
Sort Key: tenk2.four
-> Seq Scan on public.tenk2 (cost=0.00..445.00 rows=10000 width=8)
Output: two, four
(7 rows)
I think this is something that could easily be solved, either by teaching some
nodes to do simple projections, consisting only of removing / reordering some
attributes. This would match what ExecJunkFilter does, generalized to any
kind of "subset of attributes" projection.
Alternatively, we could also perform that at the Result level, leaving
individual nodes alone, by implementing a simpler result node using the
JunkFilter mechanism when it's possible (either with a full-blown
"SimpleResult" specific node, or a different execprocnode in the Result).
If the idea seems worthy, I'll keep working on it and send you a patch
demonstrating the idea.
[1] https://www.postgresql.org/message-id/flat/9957.1250956747%40sss.pgh.pa.us
Best regards,
--
Ronan Dunklau
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2021-07-13 14:22:51 | Re: Add option --drop-cascade for pg_dump/restore |
Previous Message | Tom Lane | 2021-07-13 14:15:54 | Re: Remove repeated calls to PQserverVersion |