Re: [PATCH] Generic type subscription

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Victor Wagner <vitus(at)wagner(dot)pp(dot)ru>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Subject: Re: [PATCH] Generic type subscription
Date: 2017-01-27 17:31:41
Message-ID: CA+q6zcW9-QyOdiK8T4v+f3EttaU+x9N=sFNyycWFn=m7BB=tAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 24 January 2017 at 02:07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I took an extremely quick look over the patch

Thank you for the feedback. It took some time for me to think about all
suggestions and notes.

> 1. As I mentioned previously, it's a seriously bad idea that ArrayRef
> is used for both array subscripting and array assignment. Let's fix
> that while we're at it, rather than setting that mistake in even more
> stone by embedding it in a datatype extension API.

Sure. But I assume that `SubscriptingRef` and `SubscriptingAssignmentRef`
will
be almost identical since they carry the same information to get a value
and to assign a new value (so, probably it will be just an alias with a
different related function).

> 2. I'm not very pleased that the subscripting functions have signature
> "subscripting(internal) returns internal";

Basically, current implementation of subscripting operation contains node
related logic (e.g. like to verify that we're not using slice syntax for
jsonb)
and data type related logic (e.g. to get/to assign a value in an array).
And if
it's easy enough to use:
`array_subscript(anyarray, internal) returns anyelement`
`array_assign(anyarray, internal, anyelement) returns anyarray`
form for the second one, the first one should accept node as an argument and
return node - I'm not sure if it's possible to use something else than
`internal` here. Speaking about other signature issues, sure, I'll fix them.

> 3. The contents of ArrayRef are designed on the assumption that the same
> typmod and collation values apply to both an array and its elements.

Yes, I missed that. It looks straightforward for me, we can just split
`refcollid` and `reftypmod` to `refcontainercollid`, `refelementcollid` and
`refcontainertypmod`, `refelementtypmod`.

> 4. It looks like your work on the node processing infrastructure has been
> limited to s/ArrayRef/SubscriptingRef/g, but that's not nearly enough.
> SubscriptingRef needs to be regarded as an opportunity to invoke a
> user-defined function, which means that it now acts quite a bit like
> FuncExpr. For example, the function-to-be-invoked needs to be checked for
> volatility, leakproofness, parallel safety, etc in operations that want to
> know about such things.
> ....
> I noted yesterday, you're missing a lot of plan-time manipulations that
need
> to happen for a generic function call.

Yes, I totally missed these too. I'm going to improve this situation soon.

> Actually, after thinking about that a bit more: you've really squeezed
> *three* different APIs into one function. Namely, subscript-reference
> parse analysis, array subscripting execution, and array assignment
> execution. It would be cleaner, and would reduce runtime overhead a bit,
> if those were embodied as three separate functions.
>
> It might be possible to get away with having only one pg_type column,
> pointing at the parse-analysis function. That function would generate
> a SubscriptingRef tree node containing the OID of the appropriate
> execution function, which execQual.c could call without ever knowing
> its name explicitly.
>
> This clearly would work for built-in types, since the parse-analysis
> function could rely on fmgroids.h for the OIDs of the execution functions.
> But I'm less sure if it would work in extensions, which won't have
> predetermined OIDs for their functions. Is there a way for a function
> in an extension to find the OID of one of its sibling functions?
>
> On 24 January 2017 at 07:54, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>
> Obviously there's regprocedure (or it's C equivalent), but then you're
stuck
> re-computing at runtime. I've messed around with that a bit in an effort
to
> have an extension depend on another extension that allows the user to
specify
> it's schema. If you're already doing metaprogramming it's not an enormous
> problem... if you're not already doing that it sucks. Trying to make that
> work in C would be somewhere between impossible and a nightmare.

The idea of having one function that will generate SubscriptingRef node
sounds
good. But I'm afraid of consequences about using it for extensions
(especially
since the request for general subscripting implementation came also from
their side). Is there a way to make it less troublesome?

To summarize, right now there are three options to handle a
`SubscriptingRef`
node analysis, subscripting execution and assignment execution:

* one `pg_type` column with an OID of corresponding function for each
purpose
(which isn't cool)

* one "controller" function that will call directly another function with
required logic (which is a "squeezing" and it's the current
implementation)

* one function that will return `SubscriptingRef` with an OID of required
function (which is potentially troublesome for extensions)

> BTW, a different approach that might be worth considering is to say that
> the nodetree representation of one of these things *is* a FuncExpr, and
> the new magic thing is just that we invent a new CoercionForm value
> which causes ruleutils.c to print the expression as a subscripting op.
>
> Leave ArrayRef strictly alone, and introduce new infrastructure beside it
for
> non-array containers.

Yes, it will eliminate any objections about an overhead to existing array
operations. But I'm concerned that it will "virtually" reduce flexibility of
the solution on the whole, because it means we think only about one data
type
as a target for implementation and it's easy to miss something for more
complex
(in terms of subscripting logic) data structures.

> Let the node representation for non-array-container access be FuncExpr
with
> new value(s) of funcformat.

Probably I misunderstood something, but if we want to keep all necessary
information about subscripting (like index value, slices etc.) together with
OID's of required functions, it will be almost the same as `SubscriptingRef`
isn't it? For me the current implementation looks more natural - there was a
component that was responsible for the subscripting for one data type, and
it's
implementation evolved and became more general (but of course that's
probably
just because I implemented this and it looks more natural only for me).

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-01-27 17:32:01 Re: pg_ls_dir & friends still have a hard-coded superuser check
Previous Message Simon Riggs 2017-01-27 17:21:15 Re: pg_ls_dir & friends still have a hard-coded superuser check