From: | Harald Fuchs <hari(dot)fuchs(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: cumulative count |
Date: | 2008-12-04 18:32:59 |
Message-ID: | pur64n4wpw.fsf@srv.protecting.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
In article <49381902(dot)7080209(at)gmail(dot)com>,
Carson Farmer <carson(dot)farmer(at)gmail(dot)com> writes:
> date | user
> ------------------+---------------------
> 20050201 | Bill
> 20050210 | Steve
> 20050224 | Sally
> 20050311 | Martha
> 20050316 | Ryan
> 20050322 | Phil
> 20050330 | William
> 20050415 | Mary
> 20050428 | Susan
> 20050503 | Jim
> and I want to run a query that returns a *count* of the number of
> users *each month*, ordered by year and *month*, with an additional
> column that is a *running total of the count*, as in:
> year | month | count | run_count
> -------------------+----------------+----------------+-----------------
> 2005 | 02 | 3 | 3
> 2005 | 03 | 4 | 7
> 2005 | 04 | 2 | 9
> 2005 | 05 | 1 | 10
> I can get almost everything I want with:
> SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
> FROM (SELECT EXTRACT(year from added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
> (SELECT EXTRACT(year FROM added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count
> FROM users_table GROUP BY 1, 2, 3) AS b
> WHERE a.year >= b.year AND a.month >= b.month
> GROUP BY 1, 2, 3, 4
> ORDER BY a.year, a.month asc;
> but I can't quite figure out the running total of the count. The above
> example works right up to the end of the first year, then the values
> no longer make sense. My guess is it's something to do with my WHERE
> clause, but I can't think of a better way to do things.
Yes, your WHERE condition is the problem. It should be
WHERE a.year > b.year OR a.year = b.year AND a.month > b.month.
You could simplify the date logic by doing the year/month split later, e.g.
CREATE TEMP TABLE tmp AS
SELECT date_trunc('month', date) AS dt, count(*) AS count
FROM users_table
GROUP BY dt;
SELECT extract(YEAR FROM t1.dt) AS year,
extract(MONTH FROM t1.dt) AS month,
t1.count,
sum(t2.count) AS run_count
FROM tmp t1
LEFT JOIN tmp t2 ON t2.dt <= t1.dt
GROUP BY year, month, t1.count
ORDER BY year, month;
(AFAIK the WITH clause in PostgreSQL 8.4 would let you get rid of
the temp table.)
From | Date | Subject | |
---|---|---|---|
Next Message | Rob Richardson | 2008-12-04 18:44:41 | Automatic insert statement generator? |
Previous Message | Schwéger Gábor | 2008-12-04 18:30:14 | equal and like |