From: | Vik Fearing <vik(at)postgresfriends(dot)org> |
---|---|
To: | Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp> |
Cc: | jchampion(at)timescale(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Row pattern recognition |
Date: | 2023-07-28 08:56:26 |
Message-ID: | 60651930-70bb-c849-1862-e8f7eb109094@postgresfriends.org |
Lists: | pgsql-hackers |
On 7/28/23 09:09, Tatsuo Ishii wrote:
>>> We already recalculate a frame each time a row is processed even
>>> without RPR. See ExecWindowAgg.
>>
>> Yes, after each row. Not for each function.
>
> Ok, I understand now. Looking closer at the code, I realized that each
> window function calls update_frameheadpos, which computes the frame
> head position. But actually it checks winstate->framehead_valid and if
> it's already true (probably by other window function), then it does
> nothing.
>
>>> Also RPR always requires a frame option ROWS BETWEEN CURRENT ROW,
>>> which means the frame head is changed each time current row position
>>> changes.
>>
>> Off topic for now: I wonder why this restriction is in place and
>> whether we should respect or ignore it. That is a discussion for
>> another time, though.
>
> My guess is that anything other than ROWS BETWEEN CURRENT ROW has
> little or no meaning. Consider the following example:
Yes, that makes sense.
>>>> I strongly disagree with this. Window functions do not need to know
>>>> how the frame is defined, and indeed they should not.
>>> We already break the rule by defining *support functions. See
>>> windowfuncs.c.
>> The support functions don't know anything about the frame, they just
>> know when a window function is monotonically increasing and execution
>> can either stop or be "passed through".
>
> I see following code in window_row_number_support:
>
>     /*
>      * The frame options can always become "ROWS BETWEEN UNBOUNDED
>      * PRECEDING AND CURRENT ROW". row_number() always just increments by
>      * 1 with each row in the partition. Using ROWS instead of RANGE
>      * saves effort checking peer rows during execution.
>      */
>     req->frameOptions = (FRAMEOPTION_NONDEFAULT |
>                          FRAMEOPTION_ROWS |
>                          FRAMEOPTION_START_UNBOUNDED_PRECEDING |
>                          FRAMEOPTION_END_CURRENT_ROW);
>
> I think it not only knows about the frame but even changes the frame
> options. This seems far from "don't know anything about the frame", no?
That's the planner support function. The row_number() function itself
is not even allowed to *have* a frame, per spec. We allow it, but as
you can see from that support function, we completely replace it.
So none of the partition-level window functions are affected by RPR
anyway.
>> I have two comments about this:
>>
>> It isn't just for convenience, it is for correctness. The window
>> functions do not need to know which rows they are *not* operating on.
>>
>> There is no such thing as a "full" or "reduced" frame. The standard
>> uses those terms to explain the difference between before and after
>> RPR is applied, but window functions do not get to choose which frame
>> they apply over. They only ever apply over the reduced window frame.
>
> I agree that "full window frame" and "reduced window frame" do not
> exist at the same time, and in the end (after computation of reduced
> frame), only "reduced" frame is visible to window
> functions/aggregates. But I still do think that "full window frame"
> and "reduced window frame" are important concepts to explain/understand
> how RPR works.
If we are just using those terms for documentation, then okay.
--
Vik Fearing