proposal: add window function to 8.4

From: H(dot)Harada <umi(dot)tanuki(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: proposal: add window function to 8.4
Date: 2008-06-09 12:32:58
Message-ID: e08cc0400806090532p307f0687i3a246aa5f9f293e8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This topic has been discussed on this list and many user expect that
PostgreSQL implements it.
I'd like to work on this feature and hope that we can include it on 8.4.

Former discussions are here:

http://archives.postgresql.org/pgsql-hackers/2004-11/msg01093.php

http://archives.postgresql.org/pgsql-hackers/2007-01/msg00861.php

How it works and examples:
SELECT dept, empno,
RANK() OVER(PARTITION BY dept ORDER BY age) as age_rank,
RANK() OVER(PARTITION BY dept ORDER BY salary) as salary_rank,
SUM(salary) OVER(PARTITION BY dept ORDER BY age) as run_total
FROM employees ORDER BY 1, 3, 4;

dept empno age_rank salary_rank run_total
ENG 2 1 2 40000
ENG 1 2 1 90000
QA 3 1 2 30000
QA 4 2 1 65000
(ref.: http://www.gavinsherry.org/talks/window_talk.pdf)

My current idea and concept:
- add "window function" and treat it specially such like aggregate
function and setof function.
- some features may be dropped at the first release, considering to
support them later.
- to formulate and to let it work properly are primary, performance
optimization is secondary.

From my survey around web and list archive, the points are:
- what is "window function" rank(), rank_dense(), lead() and others?
First of all, we should define the window function such like
aggregate function. In my opinion, there are two types of functions in
OVER () call context. One of them is aggregate, and the other is
window (ranking) function. Sum() in a query like

SELECT empno, sum(salary) OVER (PARTITION BY depno) FROM empsalary;

is obviously aggregate function. This type of function can be used as
it is now. Only executer will change its behavior.
Current pgsql feature sets lack window function like rank(). This
type of function must 1) have a context such as SETOF functions, 2)
return values until executor says "DONE", rather than function itself
says "DONE" as in SETOF function, and 3) know about other tuples
(mainly ORDER BY clause), as rank() doesn't take any arguments but
knows the ranking boundary. I suppose that pgsql have new function
system for these types called "window function".
Once we can define window function, users have extensibility to this
type of function.

- do we really need FRAME clause?
From my survey, Oracle and DB2 have FRAME clause

SELECT empno, sum(salary) OVER (ORDER BY empno ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW) FROM empsalary;

while MS SQL Server doesn't (note that the literal from "ROWS" to
"CURRENT ROW" is called FRAME clause). To implement FRAME clause is
much more complicated than only PARTITION and ORDER clause support
because we need random access to result tuples. Though we will be
expected to support FAME clause in long-term, for the first release it
might be able to be put off. Even though rank() doesn't support FRAME
clause (only PARTITION and ORDER) it is so useful, more than now at
least.

- execution order
In SQL:2003, the execution order is defined as

where -> group by -> having -> (windowing) * N -> order by (outer,
currently existing one)
where windowing is
partition by -> order by -> (framing) * N

But Oracle seems that it has another order

(windowing) * N -> where -> group by ... and so on.

which is better for us? With Oracle's one you can say
SELECT empno, rank() OVER (PARTITION BY depno ORDER BY saraly) AS
topsalary FROM empsalary
WHERE topsaraly < 3
to get the top 3 people taking heighest salary. In the SQL standard,
you should the nest query.
I insist the first (standard) one is better because we may want use
the result of normal aggregate in OVER clause.

- plan and node
Currently in my mind the execution strategy could be:

1. Where & GroupBy & Having
|
2. SortBy partitionClause, orderByClause
|
3. Window
foreach partition:
if not there_is_frame():
aggvalue = null
foreach row in partition:
aggvalue = agg_trans_func(aggvalue)
aggvalue = agg_final_func(aggvalue)

foreach row in partition:
if has frame clause:
aggvalue = null
frame = make_frame()
foreach row_in_frame:
aggvalue = aggregate_trans_func(aggvalue)
aggvalue = aggregate_final_func(aggvalue)

set aggvalue to row
val = window_func()
set val to row
goto 2. if another window remained
|
4. SortBy ORDER BY clause (outer) & Limit
|
5. Output

This pseudo code is quite simple and stupid. We may optimize it by
splitting tasks with MergeJoin, etc. or think about process 2. that
collect same PARTITION clauses to reduce sort operation. Or to use
Hash Agg to create PARTITION. But let's be stupid at first.
Optimization is secondary.

References:
description by Stefan DeBloch
http://wwwdvs.informatik.uni-kl.de/courses/NEDM/WS0607/Vorlesungsunterlagen/NEDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf
via Wikipedia[Select (SQL)] http://en.wikipedia.org/wiki/Select_(SQL)

BTW, what about Bizgres now?

I appreciate for any comments and dis -:)

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2008-06-09 12:37:52 Re: TODO, FAQs to Wiki?
Previous Message Andrew Chernow 2008-06-09 10:02:05 Re: libpq support for arrays and composites