Re: Limited Sort

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Limited Sort
Date: 2006-09-18 15:30:50
Message-ID: 1158593450.2696.115.camel@holly
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2006-09-18 at 14:26 +0100, Gregory Stark wrote:

> And lastly I find the idea of putting attention into OLAP functionality
> interesting. Does anyone have any ideas on where to start with that?

In SQL:2003 the OLAP functionality mostly relies on the concept of
sorted Windows over which "Windowed Aggregates" are defined.

Windows can be defined as
( CURRENT ROW |
( UNBOUNDED | n ) PRECEDING )
[( UNBOUNDED | n ) FOLLOWING]

e.g.

current situation:
CURRENT ROW

running average over last 10 rows
10 PRECEDING

running total
UNBOUNDED PRECEDING

The logic is set for the Window and doesn't make much sense to put that
in the aggregate. I imagine that we would do this by allowing the sort
results to be held as an internal cursor, so we can rewind by the
appropriate number. If preceding sort was an external sort, then we use
a tuplestore, modified so that it acts as a fixed-length FIFO queue so
we retain lookback capability.

n FOLLOWING would be emulated by fetching n rows ahead so that the
cursor was set at the correct place before starting calculation.

UNBOUNDED PRECEDING can be handled by caching the output of the last
final function in the query state, so that we do not need to have a
lookback capability at all in that case. (This may need to have an
additional function type definition, since it won't work with all
functions). e.g.

select val,
sum(val) over (order by val rows unbounded preceding)
from foo;

val running_SUM_val
1 1 we cache final_func output =1
2 3 =2+1, we then cache it again=3
3 6 =3+3, we then cache it again=6
4 10 =4+6

OLAP is a lot more complex than I've made out, but the only other thing
we need in tuplesort.c to support Windowed aggregates is:
ORDER BY foo (NULLS FIRST | NULLS LAST)
which we cannot yet specify.

I'm not planning to work on this just yet, but there's always next year.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-09-18 15:35:11 Re: [PATCHES] Linking on AIX (Was: Fix linking of OpenLDAP libraries )
Previous Message Harald Armin Massa 2006-09-18 15:29:34 Re: [PATCHES] Patch for UUID datatype (beta)