Re: total and partial sums in the same query??

From: Iuri Sampaio <iuri(dot)sampaio(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-sql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: total and partial sums in the same query??
Date: 2020-10-10 15:28:20
Message-ID: ACAD364B-3DB0-49B1-835B-67F8C877A746@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

David,

Going further in our conversation. I went to PG's documentation and did a better research in order to come out with better solution/approaches.

Furthermore, based on your previous email, your words were “key” in the process. Thanks a lot!

“ … you should construct a simplified self-contained example and,… “

I reviewed the original query:

SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total
FROM qt_vehicle_ti v
RIGHT OUTER JOIN (
SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial
FROM qt_vehicle_ti
WHERE EXTRACT(MONTH FROM creation_date) = 10
GROUP BY type1) AS t
ON t.type1 = split_part(v.description, ' ', 25)
GROUP BY type, partial

and rewrote it to the following one:

WITH
cte1 AS (SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1),
cte2 AS (SELECT split_part(description, ' ', 25) AS type2, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type2)
SELECT type1, total, partial FROM cte1 JOIN cte2 ON cte1.type1 = cte2.type2;

Indeed! Performance is way better now. As well as readability, and less code written!

Nevertheless, I’m still reluctant to the necessity of using v_normalized. On the other hand my reluctancy comes from not entirely understanding this approach and the benefits of it.

Your words were:

A simple conditional (filter) count would be much easier to understand and should be much faster:

Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;

https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES <https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES>

I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.
"

So, feel free whether to go deeper continuing this discussion

Best wishes,
I

> On Saf. 23, 1442 AH, at 00:58, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Friday, October 9, 2020, Iuri Sampaio <iuri(dot)sampaio(at)gmail(dot)com <mailto:iuri(dot)sampaio(at)gmail(dot)com>> wrote:
> Hi David,
>
> RIGHT OUTER JOIN is the key!
>
> TOTAL
> SELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type
>
> OCTOBER
> SELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type
>
>
> FINAL
> SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total FROM qt_vehicle_ti v RIGHT OUTER JOIN ( SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1) AS t ON t.type1 = split_part(v.description, ' ', 25) GROUP BY type, partial
>
>
>
> Let me know if you would use a different approach
>
> The convention I try to observe when using outer joins is to use left join, not right (outer is implied). That said, you seem to have written a left join query since the totals, a superset of october, are on the left. Also, count(*) is my learned convention instead of count(1).
>
> A simple conditional (filter) count would be much easier to understand and should be much faster:
>
> Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;
>
> https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES <https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES>
>
> I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.
>
> David J.
>
>

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message David G. Johnston 2020-10-10 17:55:20 Re: total and partial sums in the same query??
Previous Message David G. Johnston 2020-10-10 03:58:19 Re: total and partial sums in the same query??