Re: complex custom aggregate function

From: Scara Maccai <m_lists(at)yahoo(dot)it>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: complex custom aggregate function
Date: 2009-02-09 09:16:20
Message-ID: 481041.38433.qm@web24607.mail.ird.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I think I've found a solution myself to the moving average problem, so I'm posting it here in case it works for some others (and so that everybody can check that I'm doing it right and in the "best" way...)
Basically I'm
1) saving all the couples (timestamp, double) of an aggregation into an array using array_accum
2) unnesting the same array to get a table of (timestamp, double) tuples
3) calculating the moving average using the "self join" trick (found in "Transact SQL") of the table got from 2)
4) getting the max out of the average values.
Before the code, some questions:
1) I tried doing it using python, but the array was passed as a string (I didn't find it in the docs though...)
2) I didn't understand why the "array_enum" function is present only for int arrays in the contrib "intagg" module; couldn't it be a function with "anyarray" as argument? Such as:
CREATE OR REPLACE FUNCTION array_enum(anyarray)
RETURNS setof anyelement
AS 'array_unnest'
LANGUAGE INTERNAL IMMUTABLE STRICT;

3) Am I doing something wrong in the code below? Could I do it better?
CREATE AGGREGATE array_accum (anyelement)
(
sfunc = array_append,
stype = anyarray,
initcond = '{}'
);
CREATE TYPE timemax_t AS (
t timestamp,
v double precision
);
CREATE OR REPLACE FUNCTION timemax_array_enum(timemax_t[])
RETURNS setof timemax_t
AS 'array_unnest'
LANGUAGE INTERNAL IMMUTABLE STRICT;

CREATE or replace FUNCTION movingavgmax(a timemax_t[], nquarters int4)
RETURNS double precision
AS $$
DECLARE
timemax_t_row timemax_t;
retval double precision;
BEGIN
select max(mx) into retval from
(
select x..t, AVG(y.v) as mx
from timemax_array_enum(a) as x, timemax_array_enum(a) as y
where
x.t between y.t and y.t+((nquarters-1)*15||' minutes')::interval
group by x.t
having count(y.v)=nquarters
) as subs;
return retval;
END;
$$ LANGUAGE plpgsql;

-- example usage:
select movingavgmax(array_accum((quarter,value)::timemax_t), 3) from test where id = 10 AND quarter between '2008-12-01 00:00:00' and '2008-12-01 10:00:00' ;

----- Messaggio originale -----
> Da: Scara Maccai <m_lists(at)yahoo(dot)it>
> A: pgsql-general(at)postgresql(dot)org
> Inviato: Venerdì 30 gennaio 2009, 9:35:53
> Oggetto: complex custom aggregate function
>
> Hi all,
>
> I have a table like:
>
> value int,
> quarter timestamp
>
> I need an aggregate function that gives back the maximum "value" using
> this algorithm:
>
> AVG of the first hour (first 4 quarters) (AVG0)
> same as above, but 1 quarter later (AVG1)
> ...
> same as above, but n quarters later (AVGn)
>
> result: the quarter where AVGn was MAX.
>
> Example:
>
> quarter value AVGn
>
> 2008-01-01 00:00 10
> 2008-01-01 00:15 15
> 2008-01-01 00:30 5
> 2008-01-01 00:45 20 -> 12.5 ((10+15+5+20)/4)
> 2008-01-01 01:15 2 -> 21 ((15+5+20+2)/4)
> 2008-01-01 01:30 30 -> 14.25 ((5+20+2+30)/4))
>
> the result should be ('2008-01-01 00:15', 21)
>
>
>
> It would be very easy if the input to the custom aggregate function was
> ordered (because I would keep 4 internal counters), but I guess there's
> no way of "forcing" the ordering of the input to the function, right?
>
> So I have to cache all the (quarter,value) couples and give back a
> result at the end, right?

Passa a Yahoo! Mail.

La webmail che ti offre GRATIS spazio illimitato,
antispam e messenger integrato.
http://it.mail.yahoo.com/              

Browse pgsql-general by date

  From Date Subject
Next Message John R Pierce 2009-02-09 09:17:40 Re: Out of memory on SELECT in 8.3.5
Previous Message Grzegorz Jaśkiewicz 2009-02-09 09:01:46 Re: Out of memory on SELECT in 8.3.5