From: | Sam Mason <sam(at)samason(dot)me(dot)uk> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Relational Algebra and Aggregate Functions |
Date: | 2009-07-28 13:47:59 |
Message-ID: | 20090728134759.GA5407@samason.me.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Tue, Jul 28, 2009 at 09:14:38AM -0400, Robert James wrote:
> Many wrote that the functional programming 'fold' is a good model for
> relational aggregate functions. I have a few difficulties with this:
> 1. fold doesn't offer any type of GROUP BY, which is an essential component
> of aggregation.
Not sure if I'd agree, a GROUP BY without any aggregate functions looks
pretty indistinguishable from just a DISTINCT on the same columns to me.
> 2. I don't believe fold can handle things like AVG() or STDDEV(). Can it?
An "aggregate" in a database bundles together several things that you
get direct access to in most functional languages; have a look at:
http://www.postgresql.org/docs/current/static/sql-createaggregate.html
In Haskell, to pick one particularly express language, I'd write AVG as:
avg = uncurry (/) . foldl (\(t,n) v -> (t+v,n+1)) (0,0)
running "avg [1,7,15]" gives me back approximately "7.6". If I move
things around so they look like they would in PG, I get:
CREATE AGGREGATE avg (float8) (
INITCOND = (0,0)
STYPE = (float8,float8),
SFUNC = foldl (\(t,n) v -> (t+v,n+1)),
FINALFUNC = uncurry (/),
);
> Conversely, fold can handle non-commutative and non-associative operators,
> which I don't believe can be used for aggregation.
Strictly speaking this is correct; but in practice PG doesn't care about
this. array_accum is a reasonable example (I think) as the resulting
array will be in the same order as it was given to the aggregation
function.
> 3. fold is defined on sequences, not sets. This doesn't seem to be a
> problem until you think about cases where there a duplicates of the
> aggregated field. (For instance, there are 10 bags each weighing 5 lbs, and
> you want SUM(weight) - you need to project weight onto a collection which
> allows for 10 occurences, or define the aggregate function to work on the
> whole tuple somehow... I know a man named Krug worked out a formal theory
> for this...)
I don't see why this is a problem at all; could you give a concrete
example?
--
Sam http://samason.me.uk/
From | Date | Subject | |
---|---|---|---|
Next Message | Jennifer Trey | 2009-07-28 14:00:04 | Moving from Windows to Ubuntu - Have a couple of questions |
Previous Message | Greg Stark | 2009-07-28 13:30:25 | Re: Video available for PGDay SJC '09 |