Re: poc - possibility to write window function in PL languages

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: poc - possibility to write window function in PL languages
Date: 2021-01-20 08:11:33
Message-ID: CAFj8pRDtzddd_Xfx7TDw0ZPZs8LNk6KVfwS0HznRu=03FmXpCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

so 16. 1. 2021 v 0:09 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > [ plpgsql-window-functions-20210104.patch.gz ]
>
> I spent some time looking at this patch. It would certainly be
> appealing to have some ability to write custom window functions
> without descending into C; but I'm not very happy about the details.
>
> I'm okay with the idea of having a special variable of a new pseudotype.
> That's not exactly pretty, but it descends directly from how we handle
> the arguments of trigger functions, so at least there's precedent.
> What's bugging me though is the "typedvalue" stuff. That seems like a
> conceptual mess, a performance loss, and a permanent maintenance time
> sink. To avoid performance complaints, eventually this hard-wired set
> of conversions would have to bloom to cover every built-in cast, and
> as for extension types, you're just out of luck.
>

I invited typed values with an idea of larger usability. With this type we
can implement dynamic iteration over records better than now, when the
fields of records should be cast to text or json before operation. With
this type I can hold typed value longer time and I can do some like:

DECLARE var typedvalue;

var := fx(..);
IF var IS OF integer THEN
var_int := CAST(var AS int);
ELSEIF var IS OF date THEN
var_date := CAST(var AS date);
ELSE
var_text := CAST(var AS text);
END;

Sometimes (when you process some external data) this late (lazy) cast can
be better and allows you to use typed values. When I read external data,
sometimes I don't know types of these data before reading. I would like to
inject a possibility of more dynamic work with values and variables (but
still cleanly and safely). It should be more safe and faster than now, when
people should use the "text" type.

But I understand and I agree with your objections. Probably a lot of people
will use this type badly.

> One way to avoid that would be to declare the argument-fetching
> functions as polymorphics with a dummy argument that just provides
> the expected result type. So users would write something like
>
> create function pl_lag(x numeric)
> ...
> v := get_input_value_in_partition(windowobject, x, 1, -1,
> 'seek_current', false);
>
> where the argument-fetching function is declared
>
> get_input_value_in_partition(windowobject, anyelement, int, ...)
> returns anyelement
>
> and internally it could verify that the n'th window function argument
> matches the type of its second argument. While this could be made
> to work, it's kind of unsatisfying because the argument number "1" is
> so obviously redundant with the reference to "x". Ideally one should
> only have to write "x". I don't quite see how to make that work,
> but maybe there's a way?
>
> On the whole though, I think your original idea of bespoke plpgsql
> syntax is better, ie let's write something like
>
> GET WINDOW VALUE v := x AT PARTITION CURRENT(-1);
>
> and hide all the mechanism behind that. The reference to "x" is enough
> to provide the argument number and type, and the window object doesn't
> have to be explicitly visible at all.
>

yes, this syntax looks well.

The second question is work with partition context value. This should be
only one value, and of only one but of any type per function. In this case
we cannot use GET statements. I had an idea of enhancing declaration. Some
like

DECLARE
pcx PARTITION CONTEXT (int); -- read partition context
BEGIN
pcx := 10; -- set partition context

What do you think about it?

Regards

Pavel

> Yeah, this will mean that anybody who wants to provide equivalent
> functionality in some other PL will have to do more work. But it's
> not like it was going to be zero effort for them before. Furthermore,
> it's not clear to me that other PLs would want to adopt your current
> design anyway. For example, I bet PL/R would like to somehow make
> window arguments map into vectors on the R side, but there's no chance
> of that with this SQL layer in between.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-01-20 08:12:09 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Wesley Aptekar-Cassels 2021-01-20 08:02:16 Getting column names/types from select query?