From: | David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #10256: COUNT(*) behaves sort of like RANK() when used over a window containing an ORDER BY |
Date: | 2014-05-07 23:08:23 |
Message-ID: | 1399504103854-5803103.post@n5.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Tom Lane-2 wrote
> arnaud.mouronval@
> writes:
>> I discovered a problem while using a window that used an ORDER BY clause,
>> and using this window with ROW_NUMBER() and COUNT(*) at the same time.
>
> This looks correct to me. Keep in mind that the default window framing
> clause is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which really
> means "all rows up to and including the last sort peer of the current
> row".
> With no ORDER BY clause, all rows in the partition are peers, so you get a
> count covering the whole partition no matter which row you are on. With
> an ORDER BY, you get a count of only the rows up to the current row (or
> its last peer, when there are duplicate values in the ORDER BY columns).
>
>> I was expecting to get the same values in the last 2 columns.
>
> For that you'd want RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED
> FOLLOWING.
Tom's quote is pretty much an abbreviation of the following taken directly
from:
http://www.postgresql.org/docs/9.3/interactive/sql-select.html
"The default framing option is RANGE UNBOUNDED PRECEDING, which is the same
as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With ORDER BY, this
sets the frame to be all rows from the partition start up through the
current row's last peer. Without ORDER BY, all rows of the partition are
included in the window frame, since all rows become peers of the current
row."
A more useful description and example is found in the tutorial (about
halfway down, the sum(*) example):
http://www.postgresql.org/docs/9.3/interactive/tutorial-window.html
The term "peer" in the first quotation is confusing to me. My understanding
is that "PARTITION BY" defines which rows are "peers" - even if that isn't
the wording used. So now using "peers" in relation to the FRAME clause
confuses the issue. IMO the "since all rows..." is really superfluous - I
would understand:
"In the presence of an ORDER BY only rows up to the current row (including
all [partition] peers) are considered. Omitting an ORDER BY causes the
entire PARTITION to be considered the FRAME. Other frame definitions can be
requested by supplying an explicit FRAME clause."
I knew the answer but in the interest of looking through the user's eyes I
wanted to go take a look (again, done this before) and really am not
surprised that these kinds of questions are being asked. Just having to
scan through 5 sections of the documentation is difficult. That said, the
tutorial section and inline examples make it quite clear how the
with/without ORDER BY behavior of window functions (sum in the example)
cause different results.
At minimum the top of 9.3.4 could provide links to, and
descriptions/summaries of, what the other 4 sections cover and why things
are broken out the way they are. The other cross-references could point
back to that section-subsection as a kind of launch point: "Please see
section 3.5.1 for an overview of, and links to, other related sections."
Just some food for thought if anyone is industrious and annoyed enough to
act on it.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-10256-COUNT-behaves-sort-of-like-RANK-when-used-over-a-window-containing-an-ORDER-BY-tp5803073p5803103.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-05-07 23:58:16 | Re: BUG #7914: pg_dump aborts occasionally |
Previous Message | Emanuel Calvo | 2014-05-07 22:24:53 | Re: BUG #10255: CREATE COLLATION bug on 9.4 |