Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

From: Oliver Ford <ojford(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: krasiyan(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, vik(at)postgresfriends(dot)org, pgsql-hackers(at)postgresql(dot)org, andrew(at)tao11(dot)riddles(dot)org(dot)uk, david(at)fetter(dot)org
Subject: Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date: 2025-02-13 15:02:57
Message-ID: CAGMVOdtq=Uu-dfLxP3oJQ5fWv=wSuqwGq3P_OFq=+DS6K5vv5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 3, 2025 at 11:46 AM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
> > I've looked at it again and I think the code is correct, but I
> > miswrote that the array needs to be sorted. The above query returns:
> > x | y | nth_value
> > ---+---+-----------
> > 1 | 1 | 2
> > 2 | 2 | 1
> > 3 | | 2
> > 4 | 4 |
> > 5 | | 4
> > 6 | 6 | 7
> > 7 | 7 | 6
> > (7 rows)
> >
> > This is correct, for values of x:
> >
> > 1: The first non-null value of y is at position 0, however we have
> > EXCLUDE CURRENT ROW so it picks the next non-null value at position 1
> > and stores it in the array, returning 2.
> > 2: We can now take the first non-null value of y at position 0 and
> > store it in the array, returning 1.
> > 3. We take 1 preceding, using the position stored in the array, returning 2.
> > 4. 1 preceding and 1 following are both null, and we exclude the
> > current row, so returning null.
> > 5. 1 preceding is at position 3, store it in the array, returning 4.
> > 6. 1 preceding is null and we exclude the current row, so store
> > position 6 in the array, returning 7.
> > 7. 1 preceding is at position 5, store it in the array and return 6.
> >
> > It will be unordered when the EXCLUDE clause is used but the code
> > should handle this correctly.
>
> I ran this query (not using IGNORE NULLS) and get a result.
>
> SELECT
> x,
> nth_value(x,2) OVER w
> FROM generate_series(1,5) g(x)
> WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE CURRENT ROW);
> x | nth_value
> ---+-----------
> 1 | 3
> 2 | 3
> 3 | 2
> 4 | 3
> 5 | 4
> (5 rows)
>
> Since there's no NULL in x column, I expected the same result using
> IGNORE NULLS, but it was not:
>
> SELECT
> x,
> nth_value(x,2) IGNORE NULLS OVER w
> FROM generate_series(1,5) g(x)
> WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE CURRENT ROW);
> x | nth_value
> ---+-----------
> 1 | 3
> 2 | 4
> 3 | 4
> 4 | 3
> 5 | 4
> (5 rows)
>
> I suspect the difference is in the code path of
> ignorenulls_getfuncarginframe and the code path in
> WinGetFuncArgInFrame, which takes care of EXCLUDE like this.
>
> case FRAMEOPTION_EXCLUDE_CURRENT_ROW:
> if (abs_pos >= winstate->currentpos &&
> winstate->currentpos >= winstate->frameheadpos)
> abs_pos++;

Attached version doesn't use the nonnulls array if an Exclude is
specified, as I think it's not going to work with exclusions (as it's
only an optimization, this is ok and can be taken out entirely if you
prefer). I've also added your tests above to the tests.

Attachment Content-Type Size
0007-ignore-nulls.patch application/octet-stream 50.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-02-13 15:08:44 Re: [Feature Request] INSERT FROZEN to Optimize Large Cold Data Imports and Migrations
Previous Message Vladlen Popolitov 2025-02-13 14:49:38 Re: PoC. The saving of the compiled jit-code in the plan cache