From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Kip Warner <kip(at)thevertigo(dot)com> |
Cc: | Kevin Grittner <kgrittn(at)gmail(dot)com>, "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org> |
Subject: | Re: Tricky math in PSQL |
Date: | 2016-04-26 14:13:44 |
Message-ID: | CAHyXU0xDBQfROpb349C128PQ-kf=fMAgMSrndeQNmCVnOmvJmQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
On Mon, Apr 25, 2016 at 8:46 PM, Kip Warner <kip(at)thevertigo(dot)com> wrote:
> On Fri, 2016-04-22 at 14:37 -0500, Kevin Grittner wrote:
>> Please include such information in the post, either directly or as
>> an attachment. Messages on these lists go back over 18 years at
>> this point, and while you did flag that paste as "never delete",
>> who's to say whether pastebin will be around to keep the promise
>> that the URL will be valid 18 years from now?
>
> Hey Kevin,
>
> Noted. In general you are right. Because these queries are a bit long
> and sometimes wrap in ugly ways in folks' MUAs making them harder to
> read, you'll have to forgive me for this one.
No, we don't :-). Anyways, the relevant portion is here.
-- I would like to know the most efficient way of calculating the following,
-- preferably within the DBMS, to avoid having to shift intermediate
-- calculations back and forth between the client application. It is easy to
-- write and understand mathematically, but translating into PSQL is tricky...
--total_diff(id_a, id_b) =
-- (statistic_constant * statistic_diff(id_a, id_b)) +
-- (statistic_child_constant *
-- sigma(
-- sort_ordinal=1,
-- sort_ordinal=
-- min(COUNT(statistics_child.id=id_a),
COUNT(statistics_child.id=id_b)),
-- statistic_child_diff(id_a, id_b, sort_ordinal)) / constantG))
-- ...the sigma sums all statistic_child_diff() for the two id's to compare
-- for sort_ordinal 1,2,3,...to N where N is the lesser of the number of rows in
-- either statistics_child with id_a or id_b.
I'm not going to work out the whole problem for you, but you probably
want to start looking in the direction custom aggregate function.
Most people are not aware of this but custom aggregates can be applied
over a window. The state can be a composite type and thus hold
arbitrarily complex information that you can maintain as you walk
through the set (this works great for time series and things like
that).
Barring that, if you are still looking for solutions to this type of
problem in the database, I'd point you in the direction of pl/r. The
learning curve is somewhat steep but rewards mastery.
CREATE AGGREGATE RateChange(NUMERIC, NUMERIC, INT)
(
SFUNC = AggRateChange,
STYPE = NUMERIC
);
CREATE OR REPLACE FUNCTION AggRateChange(
State NUMERIC,
Seed NUMERIC,
RateChange NUMERIC,
NDigits INT) RETURNS NUMERIC AS
$$
SELECT ROUND(COALESCE($1, $2) * (1.0 + $3), NDigits);
$$ LANGUAGE SQL IMMUTABLE;
select ratechange(100, (random() * 0.05)::numeric, 3) over(order by s)
from generate_series(1,10) s;
ratechange
────────────
101.524
104.570
104.674
106.599
111.685
114.583
120.174
125.561
130.291
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Kip Warner | 2016-04-27 07:09:44 | Re: Tricky math in PSQL |
Previous Message | Kip Warner | 2016-04-26 01:46:18 | Re: Tricky math in PSQL |