Re: Printing window function OVER clauses in EXPLAIN

From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Printing window function OVER clauses in EXPLAIN
Date: 2025-03-09 14:12:25
Message-ID: 202503091412.mzlgmzmmzwim@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

Would it be possible and make sense to use notation of explicit WINDOW
clauses, for cases where multiple window functions invoke identical
window definitions? I'm thinking of something like

explain verbose SELECT
empno,
depname,
row_number() OVER testwin rn,
rank() OVER testwin rnk,
count(*) OVER testwin cnt
FROM empsalary
window testwin as
(PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

for which, with the patch, we'd get this

QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
WindowAgg (cost=74.64..101.29 rows=1070 width=68)
Output: empno, depname, row_number() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), count(*) OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), enroll_date
-> Sort (cost=74.54..77.21 rows=1070 width=44)
Output: depname, enroll_date, empno
Sort Key: empsalary.depname, empsalary.enroll_date
-> Seq Scan on pg_temp.empsalary (cost=0.00..20.70 rows=1070 width=44)
Output: depname, enroll_date, empno
(7 filas)

which is pretty ugly to read and requires careful tracking to verify
that they're all defined on the same window. Previously, we just get

QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────
WindowAgg (cost=74.64..101.29 rows=1070 width=68)
Output: empno, depname, row_number() OVER (?), rank() OVER (?), count(*) OVER (?), enroll_date
-> Sort (cost=74.54..77.21 rows=1070 width=44)
Output: depname, enroll_date, empno
Sort Key: empsalary.depname, empsalary.enroll_date
-> Seq Scan on pg_temp.empsalary (cost=0.00..20.70 rows=1070 width=44)
Output: depname, enroll_date, empno
(7 filas)

so it didn't matter.

I'd imagine something like

QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Window testwin AS (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
WindowAgg (cost=74.64..101.29 rows=1070 width=68)
Output: empno, depname, row_number() OVER testwin, rank() OVER testwin, count(*) OVER testwin, enroll_date
-> Sort (cost=74.54..77.21 rows=1070 width=44)
Output: depname, enroll_date, empno
Sort Key: empsalary.depname, empsalary.enroll_date
-> Seq Scan on pg_temp.empsalary (cost=0.00..20.70 rows=1070 width=44)
Output: depname, enroll_date, empno
(7 filas)

I imagine this working even if the user doesn't explicitly use a WINDOW
clause, if only because it makes the explain easier to read, and it's
much clearer if the user specifies two different window definitions.
So with David Johnston's example, something like

Window window1 AS (PARTITION BY depname ORDER BY enroll_date ROWS UNBOUNDED PRECEDING)
Window window2 AS (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN CURRENT ROW AND CURRENT ROW)
WindowAgg
Output: empno, depname, (row_number() OVER window1), rank() OVER window1, count(*) OVER window2, enroll_date
-> WindowAgg
Output: depname, enroll_date, empno, row_number() OVER window1, rank() OVER window1
-> Sort
Output: depname, enroll_date, empno
Sort Key: empsalary.depname, empsalary.enroll_date
-> Seq Scan on pg_temp.empsalary
Output: depname, enroll_date, empno

(Hmm, not sure if the Window clauses would be top-level or attached to
each WindowAgg in its own level.)

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-03-09 15:45:41 Re: Printing window function OVER clauses in EXPLAIN
Previous Message jian he 2025-03-09 13:41:49 Re: general purpose array_sort