Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: ojford(at)gmail(dot)com
Cc: krasiyan(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, vik(at)postgresfriends(dot)org, pgsql-hackers(at)postgresql(dot)org, andrew(at)tao11(dot)riddles(dot)org(dot)uk, david(at)fetter(dot)org
Subject: Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date: 2025-01-27 11:51:15
Message-ID: 20250127.205115.1603241525112208994.ishii@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> The attached patch should fix both of these. I've added extra tests
> with a PARTITION BY in the window clause to test for multiple
> partitions.

I have looked through the v5 patch. Here are review comments.

From 5268754b33103fefc511b57ec546103899f70dbe Mon Sep 17 00:00:00 2001
From: Oliver Ford <ojford(at)gmail(dot)com>
Date: Thu, 23 Jan 2025 20:11:17 +0000
Subject: [PATCH] :ignore nulls

---
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 9a1acce2b5..d93a44633e 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -69,6 +69,10 @@ typedef struct WindowObjectData
int readptr; /* tuplestore read pointer for this fn */
int64 markpos; /* row that markptr is positioned on */
int64 seekpos; /* row that readptr is positioned on */
+ int ignore_nulls; /* ignore nulls */
+ int64 *win_nonnulls; /* tracks non-nulls in ignore nulls mode */

After ignore_nulls, there will be a 4-byte hole because win_nonnulls
is an 8-byte variable. It would be better to swap them.

@@ -1263,6 +1268,15 @@ begin_partition(WindowAggState *winstate)

winobj->markpos = -1;
winobj->seekpos = -1;
+
+
+ /* reallocate null check */
+ if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
+ {
+ perfuncstate->winobj->win_nonnulls = palloc_array(int64, 16);
+ perfuncstate->winobj->nonnulls_size = 16;

Those 2 lines above are not necessary. Since win_nonnulls are
allocated in ExecInitWindowAgg() in the per query query context, it
survives across partitions. You only need initialize nonnulls_len to
0.

+ perfuncstate->winobj->nonnulls_len = 0;
+ }
}
}

@@ -1383,7 +1397,9 @@ release_partition(WindowAggState *winstate)

/* Release any partition-local state of this window function */
if (perfuncstate->winobj)
+ {
perfuncstate->winobj->localmem = NULL;
+ }

You accidentally added unnecessary curly braces.

@@ -2679,6 +2698,13 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
winobj->argstates = wfuncstate->args;
winobj->localmem = NULL;
perfuncstate->winobj = winobj;
+ winobj->ignore_nulls = wfunc->ignore_nulls;
+ if (winobj->ignore_nulls == PARSER_IGNORE_NULLS)
+ {
+ winobj->win_nonnulls = palloc_array(int64, 16);
+ winobj->nonnulls_size = 16;
+ winobj->nonnulls_len = 0;
+ }

I don't like to see two "16" here. It would better to use #define or
something.

It will be better to declare the prototype of increment_nonnulls,
ignorenulls_getfuncarginpartition, and ignorenulls_getfuncarginframe
in the begging of the file as other static functions already do.

+/*
+ * ignorenulls_getfuncarginframe
+ * For IGNORE NULLS, get the next nonnull value in the frame, moving forward or backward
+ * until we find a value or reach the frame's end.
+ */
+static Datum
+ignorenulls_getfuncarginframe(WindowObject winobj, int argno,

Do you assume that win_nonnulls is sorted by pos? I think it's
necessarily true that pos in win_nonnulls array is sorted. Is that ok?

+ /*
+ * Store previous rows. Only possible in SEEK_HEAD mode
+ */
+ for (i = 0; i < winobj->nonnulls_len; ++i)
+ {
+ int inframe;
+
+ if (winobj->win_nonnulls[i] < winobj->markpos)

There are too many "winobj->win_nonnulls[i]". You could assign to a
variable "winobj->win_nonnulls[i]" and use the variable.

+ continue;
+ if (!window_gettupleslot(winobj, winobj->win_nonnulls[i], slot))
+ continue;
+
+ inframe = row_is_in_frame(winstate, winobj->win_nonnulls[i], slot);
+ if (inframe <= 0)
+ {
+ if (inframe == -1 && set_mark)
+ WinSetMarkPosition(winobj, winobj->win_nonnulls[i]);

I think in most cases inframe returns 0 and WinSetMarkPosition is not
called. What use case do you have in your mind when inframe is -1?

+check_frame:
+ do
+ {
+ int inframe;
+
+ if (!window_gettupleslot(winobj, abs_pos, slot))
+ goto out_of_frame;
+
+ inframe = row_is_in_frame(winstate, abs_pos, slot);
+ if (inframe == -1)
+ goto out_of_frame;
+ else if (inframe == 0)
+ goto advance;
+
+ gottuple = window_gettupleslot(winobj, abs_pos, slot);

Do you really need to call window_gettupleslot here? It's already
called above.

--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -576,6 +576,18 @@ typedef struct GroupingFunc
* Collation information is irrelevant for the query jumbling, as is the
* internal state information of the node like "winstar" and "winagg".
*/
+
+/*
+ * Null Treatment options. If specified, initially set to PARSER_IGNORE
+ * or PARSER_RESPECT. PARSER_IGNORE_NULLS is then converted to IGNORE_NULLS
+ * if the window function allows the null treatment clause.
+ */
+#define IGNORE_NULLS 4
+#define RESPECT_NULLS 3
+#define PARSER_IGNORE_NULLS 2
+#define PARSER_RESPECT_NULLS 1
+#define NO_NULLTREATMENT 0

This looks strange to me. Why do you start the define value from 4
down to 0? Also there is no place to use RESPECT_NULLS. Do we need it?

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2025-01-27 12:42:18 Re: Virtual generated columns
Previous Message Daniil Davydov 2025-01-27 11:38:47 Accessing an invalid pointer in BufferManagerRelation structure