Re: cumulative count

From: Harald Fuchs <hari(dot)fuchs(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: cumulative count
Date: 2008-12-04 18:32:59
Message-ID: pur64n4wpw.fsf@srv.protecting.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In article <49381902(dot)7080209(at)gmail(dot)com>,
Carson Farmer <carson(dot)farmer(at)gmail(dot)com> writes:

> date | user
> ------------------+---------------------
> 20050201 | Bill
> 20050210 | Steve
> 20050224 | Sally
> 20050311 | Martha
> 20050316 | Ryan
> 20050322 | Phil
> 20050330 | William
> 20050415 | Mary
> 20050428 | Susan
> 20050503 | Jim

> and I want to run a query that returns a *count* of the number of
> users *each month*, ordered by year and *month*, with an additional
> column that is a *running total of the count*, as in:

> year | month | count | run_count
> -------------------+----------------+----------------+-----------------
> 2005 | 02 | 3 | 3
> 2005 | 03 | 4 | 7
> 2005 | 04 | 2 | 9
> 2005 | 05 | 1 | 10

> I can get almost everything I want with:

> SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
> FROM (SELECT EXTRACT(year from added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
> (SELECT EXTRACT(year FROM added_date) AS year,
> EXTRACT(month FROM added_date) AS month,
> TO_CHAR(added_date, 'Month') AS month_name,
> COUNT(*) AS count
> FROM users_table GROUP BY 1, 2, 3) AS b
> WHERE a.year >= b.year AND a.month >= b.month
> GROUP BY 1, 2, 3, 4
> ORDER BY a.year, a.month asc;

> but I can't quite figure out the running total of the count. The above
> example works right up to the end of the first year, then the values
> no longer make sense. My guess is it's something to do with my WHERE
> clause, but I can't think of a better way to do things.

Yes, your WHERE condition is the problem. It should be
WHERE a.year > b.year OR a.year = b.year AND a.month > b.month.

You could simplify the date logic by doing the year/month split later, e.g.

CREATE TEMP TABLE tmp AS
SELECT date_trunc('month', date) AS dt, count(*) AS count
FROM users_table
GROUP BY dt;

SELECT extract(YEAR FROM t1.dt) AS year,
extract(MONTH FROM t1.dt) AS month,
t1.count,
sum(t2.count) AS run_count
FROM tmp t1
LEFT JOIN tmp t2 ON t2.dt <= t1.dt
GROUP BY year, month, t1.count
ORDER BY year, month;

(AFAIK the WITH clause in PostgreSQL 8.4 would let you get rid of
the temp table.)

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rob Richardson 2008-12-04 18:44:41 Automatic insert statement generator?
Previous Message Schwéger Gábor 2008-12-04 18:30:14 equal and like