From: | Iuri Sampaio <iuri(dot)sampaio(at)gmail(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: total and partial sums in the same query?? |
Date: | 2020-10-10 15:28:20 |
Message-ID: | ACAD364B-3DB0-49B1-835B-67F8C877A746@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
David,
Going further in our conversation. I went to PG's documentation and did a better research in order to come out with better solution/approaches.
Furthermore, based on your previous email, your words were “key” in the process. Thanks a lot!
“ … you should construct a simplified self-contained example and,… “
I reviewed the original query:
SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total
FROM qt_vehicle_ti v
RIGHT OUTER JOIN (
SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial
FROM qt_vehicle_ti
WHERE EXTRACT(MONTH FROM creation_date) = 10
GROUP BY type1) AS t
ON t.type1 = split_part(v.description, ' ', 25)
GROUP BY type, partial
and rewrote it to the following one:
WITH
cte1 AS (SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1),
cte2 AS (SELECT split_part(description, ' ', 25) AS type2, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type2)
SELECT type1, total, partial FROM cte1 JOIN cte2 ON cte1.type1 = cte2.type2;
Indeed! Performance is way better now. As well as readability, and less code written!
Nevertheless, I’m still reluctant to the necessity of using v_normalized. On the other hand my reluctancy comes from not entirely understanding this approach and the benefits of it.
Your words were:
“
A simple conditional (filter) count would be much easier to understand and should be much faster:
Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;
https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES <https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES>
I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.
"
So, feel free whether to go deeper continuing this discussion
Best wishes,
I
> On Saf. 23, 1442 AH, at 00:58, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Friday, October 9, 2020, Iuri Sampaio <iuri(dot)sampaio(at)gmail(dot)com <mailto:iuri(dot)sampaio(at)gmail(dot)com>> wrote:
> Hi David,
>
> RIGHT OUTER JOIN is the key!
>
> TOTAL
> SELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type
>
> OCTOBER
> SELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type
>
>
> FINAL
> SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total FROM qt_vehicle_ti v RIGHT OUTER JOIN ( SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1) AS t ON t.type1 = split_part(v.description, ' ', 25) GROUP BY type, partial
>
>
>
> Let me know if you would use a different approach
>
> The convention I try to observe when using outer joins is to use left join, not right (outer is implied). That said, you seem to have written a left join query since the totals, a superset of october, are on the left. Also, count(*) is my learned convention instead of count(1).
>
> A simple conditional (filter) count would be much easier to understand and should be much faster:
>
> Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;
>
> https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES <https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES>
>
> I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.
>
> David J.
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2020-10-10 17:55:20 | Re: total and partial sums in the same query?? |
Previous Message | David G. Johnston | 2020-10-10 03:58:19 | Re: total and partial sums in the same query?? |