From: | Jason Aleski <jason(dot)aleski(at)gmail(dot)com> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Better way to compute moving averages? |
Date: | 2015-01-23 22:58:51 |
Message-ID: | 54C2D22B.8050605@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I've been asked compute various moving averages of end of day sales by
store. I can do this for all rows with no problem (same query without
the WHERE clause towards the end). That query took 10-15 minutes to run
over approx 3.4 million rows. I'm sure they will want this information
to be added to the daily end of day reports. I can run the query below
(excluding the WHERE clause) but it takes almost as long to run one day
as it does the entire dataset. It looks like when I do the inner
select, it is still running over the entire dataset. I have added a
"WHERE eod_ts > CURRENT_TIMESTAMP - INTERVAL '365 days'" (as below) to
the inner query, which allows the query to run between 1-2 minutes.
Question 1) This seems to work, but was curious if there is a better way.
Question 2) Is there a way to specify a date, instead of using current
date and current_timestamp, as a variable and use that in the query? I
know I can do that in my Java program
using variables, but wasn't sure if there was a way to do this with a
function or stored procedure?
INSERT INTO historical_data_avg (store_id, date, avg7sales, avg14sales,
avg30sales, avg60sales, avg90sales, avg180sales) (
SELECT t1.store_id, t1.eod_ts, t1.avg5sales, t1.avg10sales,
t1.avg20sales, t1.avg50sales, t1.avg100sales, t1.avg180sales FROM (
SELECT
store_id,
eod_ts,
avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 4 FOLLOWING) AS avg5sales,
avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 9 FOLLOWING) AS avg10sales,
avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 19 FOLLOWING) AS avg20sales,
avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 49 FOLLOWING) AS avg50sales,
avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 99 FOLLOWING) AS avg100sales,
avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 179 FOLLOWING) AS avg200sales
FROM end_of_day_data
WHERE eod_ts > CURRENT_TIMESTAMP - INTERVAL '260 days'
GROUP BY store_id, eod_ts, eod_sales
ORDER BY ticker_id, eod_ts
) as t1
WHERE t1.eod_ts = current_date
);
From | Date | Subject | |
---|---|---|---|
Next Message | David G Johnston | 2015-01-24 00:50:27 | Re: Better way to compute moving averages? |
Previous Message | ktm@rice.edu | 2015-01-23 20:06:57 | Re: [SQL] commit inside a function failing |