Re: cumulative count

From: Gerhard Heift <ml-postgresql-20081012-3518(at)gheift(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: cumulative count
Date: 2008-12-04 18:49:59
Message-ID: 20081204184959.GB8382@toaster.kawo1.rwth-aachen.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Dec 04, 2008 at 07:32:59PM +0100, Harald Fuchs wrote:
> In article <49381902(dot)7080209(at)gmail(dot)com>,
> Carson Farmer <carson(dot)farmer(at)gmail(dot)com> writes:
>
> > date | user
> > ------------------+---------------------
> > 20050201 | Bill
> > 20050210 | Steve
> > 20050224 | Sally
> > 20050311 | Martha
> > 20050316 | Ryan
> > 20050322 | Phil
> > 20050330 | William
> > 20050415 | Mary
> > 20050428 | Susan
> > 20050503 | Jim
>
> > and I want to run a query that returns a *count* of the number of
> > users *each month*, ordered by year and *month*, with an additional
> > column that is a *running total of the count*, as in:
>
> > year | month | count | run_count
> > -------------------+----------------+----------------+-----------------
> > 2005 | 02 | 3 | 3
> > 2005 | 03 | 4 | 7
> > 2005 | 04 | 2 | 9
> > 2005 | 05 | 1 | 10
>
> > I can get almost everything I want with:
>
> > SELECT a.year, a.month, a.count, a.month_name, sum(b.count) AS total
> > FROM (SELECT EXTRACT(year from added_date) AS year,
> > EXTRACT(month FROM added_date) AS month,
> > TO_CHAR(added_date, 'Month') AS month_name,
> > COUNT(*) AS count FROM users_table GROUP BY 1, 2, 3) AS a,
> > (SELECT EXTRACT(year FROM added_date) AS year,
> > EXTRACT(month FROM added_date) AS month,
> > TO_CHAR(added_date, 'Month') AS month_name,
> > COUNT(*) AS count
> > FROM users_table GROUP BY 1, 2, 3) AS b
> > WHERE a.year >= b.year AND a.month >= b.month
> > GROUP BY 1, 2, 3, 4
> > ORDER BY a.year, a.month asc;
>
> > but I can't quite figure out the running total of the count. The above
> > example works right up to the end of the first year, then the values
> > no longer make sense. My guess is it's something to do with my WHERE
> > clause, but I can't think of a better way to do things.
>
> Yes, your WHERE condition is the problem. It should be
> WHERE a.year > b.year OR a.year = b.year AND a.month > b.month.
>
> You could simplify the date logic by doing the year/month split later, e.g.
>
> CREATE TEMP TABLE tmp AS
>
> SELECT extract(YEAR FROM t1.dt) AS year,
> extract(MONTH FROM t1.dt) AS month,
> t1.count,
> sum(t2.count) AS run_count
> FROM tmp t1
> LEFT JOIN tmp t2 ON t2.dt <= t1.dt
> GROUP BY year, month, t1.count
> ORDER BY year, month;

What about:

SELECT extract(YEAR FROM t1.dt) AS year,
extract(MONTH FROM t1.dt) AS month,
t1.count,
sum(t2.count) AS run_count
FROM (
SELECT date_trunc('month', date) AS dt, count(*) AS count
FROM users_table
GROUP BY dt
) AS t1
LEFT JOIN tmp t2 ON t2.dt <= t1.dt
GROUP BY year, month, t1.count
ORDER BY year, month;

Regards,
Gerhard

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Raymond O'Donnell 2008-12-04 18:57:35 Re: serial
Previous Message Rob Richardson 2008-12-04 18:44:41 Automatic insert statement generator?