Sv: Not counting duplicates of declared pratition in OVER()-clause

From: Andreas Joseph Krogh <andreas(at)visena(dot)com>
To: pgsql-sql(at)lists(dot)postgresql(dot)org
Subject: Sv: Not counting duplicates of declared pratition in OVER()-clause
Date: 2017-11-28 18:03:29
Message-ID: VisenaEmail.e2.91a23a9fffcf3cd3.16003ca4629@tc7-visena
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

På tirsdag 28. november 2017 kl. 18:54:47, skrev Andreas Joseph Krogh <
andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>>:
Hi.
 
I'm trying to prevent duplicate values from being part of SUM().
 
(complete schema with INSERTs below)
 
I have this query to count all log-entries per activity per month in a
sub-query, then adding a value from another table in the outer query, which I'd
then like to sum but only count values from the same month once:
 
SELECT info.*, stuff.value + info.total_for_month AS new_value , SUM
(stuff.value +info.total_for_month) OVER () AS total_new_value_sum FROM (
SELECT DISTINCT date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE)
AS month , log.activity_id , count(log.entity_id) OVER(partition by date_trunc(
'month', log.start_date::TIMESTAMP WITHOUT TIME ZONE), log.activity_id) AS
num_logs_per_activity ,count(log.entity_id) OVER (partition by date_trunc(
'month', log.start_date::TIMESTAMP WITHOUT TIME ZONE)) AS total_for_month FROM
log_entrylog ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT
TIME ZONE) ASC, log.activity_id ASC ) AS info LEFT OUTER JOIN stuff ON info
.month = stuff.month ;

 
It seems I forgot to turn my brain on, sorry.
 
This query gives me what I want (using row_number() and an outer query with
FILTER on rownum=1):
 
SELECT q.* , SUM(q.new_value) FILTER (WHERE q.rownum = 1) OVER() AS
total_new_value_sumFROM ( SELECT info.*, stuff.value + info.total_for_month AS
new_value ,row_number() OVER (partition by info.month) as rownum FROM ( SELECT
DISTINCT date_trunc('month', log.start_date::TIMESTAMP WITHOUT TIME ZONE) AS
month , log.activity_id , count(log.entity_id) OVER(partition by date_trunc(
'month', log.start_date::TIMESTAMP WITHOUT TIME ZONE), log.activity_id) AS
num_logs_per_activity ,count(log.entity_id) OVER (partition by date_trunc(
'month', log.start_date::TIMESTAMP WITHOUT TIME ZONE)) AS total_for_month FROM
log_entrylog ORDER BY date_trunc('month', log.start_date::TIMESTAMP WITHOUT
TIME ZONE) ASC, log.activity_id ASC ) AS info LEFT OUTER JOIN stuff ON info
.month = stuff.month ) q ;
 
Gives:
month activity_id num_logs_per_activity total_for_month new_value rownum
total_new_value_sum 2017-01-01 00:00:00.000000 1 4 8 30 1 141 2017-01-01
00:00:00.000000 2 4 8 30 2 141 2017-02-01 00:00:00.000000 1 10 12 111 1 141
2017-02-01 00:00:00.000000 2 2 12 111 2 141 2017-03-01 00:00:00.000000 1 1 1
NULL 1 141
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com <mailto:andreas(at)visena(dot)com>
www.visena.com <https://www.visena.com>
<https://www.visena.com>

 

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Ertan Küçükoğlu 2017-12-07 09:59:27 Timestamp alculation identical to Microsoft Excel results
Previous Message Andreas Joseph Krogh 2017-11-28 17:54:47 Not counting duplicates of declared pratition in OVER()-clause