From: | Rocky Ji <rockyji3438(at)gmail(dot)com> |
---|---|
To: | pgsql-sql(at)lists(dot)postgresql(dot)org |
Subject: | Why does GROUP BY reduce number of rows? |
Date: | 2019-03-01 12:55:44 |
Message-ID: | CAN2Gq-S_TSjGg0Aoygz6SWNgJY3+sf1mwuenpPjAhj8qLs7OBA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
This question is an attempt to understand/question implementation of SQL in
PostgreSQL.
While reading PostgreSQL v10 docs (PDF version), I noticed a footnote at
end of **2.5. Querying a Table** stating:
> the implementation of DISTINCT automatically orders the rows and so
ORDER BY is unnecessary. But this is not required by the SQL standard, and
current PostgreSQL does not guarantee that DISTINCT causes the rows to be
ordered
Let's call this error/mistake a "UDFS" (Unintended Deviation From
Standards). It is possible there are other such errors in PostgreSQL's SQL
implementation. (I'm glad PostgreSQL rectified it).
I think: reduction in number of rows in output when `GROUP BY` clause is
used is another UDFS.
Does the SQL Standard (can't buy, and I probably won't understand it)
explicitly state that **only the final row of an aggregate should be
displayed**? Why not first row of an aggregate group, why not n-th row?
What if important (i.e. "uniquely identifying" for "human eyes") data was
contained in first row of a group. Why does the underlying program (psql
server, I suppose) **decide** that preceding rows are not worth displaying?
I don't mean to question the developers but perhaps there are bigger
reasons why things are the way they are.
Cheers.
P.S. I am a SQL novice, this
<http://sqlite.1065341.n5.nabble.com/How-to-get-aggregate-without-reducing-number-of-rows-repeats-are-ok-td105867.html>
thread lead me to think like this. In it's context: Why do we need to
calculate aggregate for each row? Isn't it *wrong*?
From | Date | Subject | |
---|---|---|---|
Next Message | Achilleas Mantzios | 2019-03-01 13:17:48 | Re: Top 3 values for each group in PGSQL |
Previous Message | Thomas Kellerer | 2019-03-01 12:52:02 | Re: Top 3 values for each group in PGSQL |