Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

From: Oliver Ford <ojford(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: krasiyan(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, vik(at)postgresfriends(dot)org, pgsql-hackers(at)postgresql(dot)org, andrew(at)tao11(dot)riddles(dot)org(dot)uk, david(at)fetter(dot)org
Subject: Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date: 2025-01-27 16:19:30
Message-ID: CAGMVOdtZ02fY7SM3b09mq15z64zDa8YWYdUt2CK+XwfPuHLyJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 27, 2025 at 11:51 AM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
> I have looked through the v5 patch. Here are review comments.

New version attached.

> @@ -69,6 +69,10 @@ typedef struct WindowObjectData
> int readptr; /* tuplestore read pointer for this fn */
> int64 markpos; /* row that markptr is positioned on */
> int64 seekpos; /* row that readptr is positioned on */
> + int ignore_nulls; /* ignore nulls */
> + int64 *win_nonnulls; /* tracks non-nulls in ignore nulls mode */
>
> After ignore_nulls, there will be a 4-byte hole because win_nonnulls
> is an 8-byte variable. It would be better to swap them.

Done.

> @@ -1263,6 +1268,15 @@ begin_partition(WindowAggState *winstate)
>
> winobj->markpos = -1;
> winobj->seekpos = -1;
> +
> +
> + /* reallocate null check */
> + if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
> + {
> + perfuncstate->winobj->win_nonnulls = palloc_array(int64, 16);
> + perfuncstate->winobj->nonnulls_size = 16;
>
> Those 2 lines above are not necessary. Since win_nonnulls are
> allocated in ExecInitWindowAgg() in the per query query context, it
> survives across partitions. You only need initialize nonnulls_len to
> 0.

Done.

> @@ -1383,7 +1397,9 @@ release_partition(WindowAggState *winstate)
>
> /* Release any partition-local state of this window function */
> if (perfuncstate->winobj)
> + {
> perfuncstate->winobj->localmem = NULL;
> + }
>
> You accidentally added unnecessary curly braces.

Removed.

> @@ -2679,6 +2698,13 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
> winobj->argstates = wfuncstate->args;
> winobj->localmem = NULL;
> perfuncstate->winobj = winobj;
> + winobj->ignore_nulls = wfunc->ignore_nulls;
> + if (winobj->ignore_nulls == PARSER_IGNORE_NULLS)
> + {
> + winobj->win_nonnulls = palloc_array(int64, 16);
> + winobj->nonnulls_size = 16;
> + winobj->nonnulls_len = 0;
> + }
>
> I don't like to see two "16" here. It would better to use #define or
> something.
>
> It will be better to declare the prototype of increment_nonnulls,
> ignorenulls_getfuncarginpartition, and ignorenulls_getfuncarginframe
> in the begging of the file as other static functions already do.

Made a new define and added declarations.

> +/*
> + * ignorenulls_getfuncarginframe
> + * For IGNORE NULLS, get the next nonnull value in the frame, moving forward or backward
> + * until we find a value or reach the frame's end.
> + */
> +static Datum
> +ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
>
> Do you assume that win_nonnulls is sorted by pos? I think it's
> necessarily true that pos in win_nonnulls array is sorted. Is that ok?

Yes it must be sorted on my understanding of the code.

> + /*
> + * Store previous rows. Only possible in SEEK_HEAD mode
> + */
> + for (i = 0; i < winobj->nonnulls_len; ++i)
> + {
> + int inframe;
> +
> + if (winobj->win_nonnulls[i] < winobj->markpos)
>
> There are too many "winobj->win_nonnulls[i]". You could assign to a
> variable "winobj->win_nonnulls[i]" and use the variable.

Done.

> + continue;
> + if (!window_gettupleslot(winobj, winobj->win_nonnulls[i], slot))
> + continue;
> +
> + inframe = row_is_in_frame(winstate, winobj->win_nonnulls[i], slot);
> + if (inframe <= 0)
> + {
> + if (inframe == -1 && set_mark)
> + WinSetMarkPosition(winobj, winobj->win_nonnulls[i]);
>
> I think in most cases inframe returns 0 and WinSetMarkPosition is not
> called. What use case do you have in your mind when inframe is -1?

Removed.

> +check_frame:
> + do
> + {
> + int inframe;
> +
> + if (!window_gettupleslot(winobj, abs_pos, slot))
> + goto out_of_frame;
> +
> + inframe = row_is_in_frame(winstate, abs_pos, slot);
> + if (inframe == -1)
> + goto out_of_frame;
> + else if (inframe == 0)
> + goto advance;
> +
> + gottuple = window_gettupleslot(winobj, abs_pos, slot);
>
> Do you really need to call window_gettupleslot here? It's already
> called above.

Removed.

> --- a/src/include/nodes/primnodes.h
> +++ b/src/include/nodes/primnodes.h
> @@ -576,6 +576,18 @@ typedef struct GroupingFunc
> * Collation information is irrelevant for the query jumbling, as is the
> * internal state information of the node like "winstar" and "winagg".
> */
> +
> +/*
> + * Null Treatment options. If specified, initially set to PARSER_IGNORE
> + * or PARSER_RESPECT. PARSER_IGNORE_NULLS is then converted to IGNORE_NULLS
> + * if the window function allows the null treatment clause.
> + */
> +#define IGNORE_NULLS 4
> +#define RESPECT_NULLS 3
> +#define PARSER_IGNORE_NULLS 2
> +#define PARSER_RESPECT_NULLS 1
> +#define NO_NULLTREATMENT 0
>
> This looks strange to me. Why do you start the define value from 4
> down to 0? Also there is no place to use RESPECT_NULLS. Do we need it?

Removed RESPECT_NULLS and started from 0.

Attachment Content-Type Size
0006-ignore-nulls.patch application/octet-stream 49.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-01-27 17:03:53 Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Previous Message Corey Huinker 2025-01-27 16:09:31 Re: Statistics Import and Export