From: | Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c) |
Date: | 2016-02-08 00:28:45 |
Message-ID: | 9A28C8860F777E439AA12E8AEA7694F8011AA2AE@BPXM15GP.gisp.nec.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On Wed, Feb 3, 2016 at 11:57 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > At this moment, I tried to write up description at nodes/nodes.h.
> > The amount of description is about 100lines. It is on a borderline
> > whether we split off this chunk into another header file, in my sense.
> >
> >
> > On the other hands, I noticed that we cannot omit checks for individual
> > callbacks on CustomXXXX node type, ExtensibleNodeMethods is embedded in
> > the CustomXXXXMethods structure, thus we may have CustomXXXX node with
> > no extensible feature.
> > This manner is beneficial because extension does not need to register
> > the library and symbol name for serialization. So, CustomScan related
> > code still checks existence of individual callbacks.
>
> I was looking over this patch yesterday, and something was bugging me
> about it, but I couldn't quite put my finger on what it was. But now
> I think I know.
>
> I think of an extensible node type facility as something that ought to
> be defined to allow a user to create new types of nodes. But this is
> not that. What this does is allows you to have a CustomScan or
> ForeignScan node - that is, the existing node type - which is actually
> larger than a normal node of that type and has some extra data that is
> part of it. I'm having a really hard time being comfortable with that
> concept. Somehow, it seems like the node type should determine the
> size of the node. I can stretch my brain to the point of being able
> to say, well, maybe if the node tag is T_ExtensibleNode, then you can
> look at char *extnodename to figure out what type of node it is
> really, and then from there get the size. But what this does is:
> every time you see a T_CustomScan or T_ForeignScan node, it might not
> really be that kind of node but something else else, and tomorrow
> there might be another half-dozen node types with a similar property.
> And every one of those node types will have char *extnodename in a
> different place in the structure, so a hypothetical piece of code that
> wanted to find the extension methods for a node, or the size of a
> node, would need a switch that knows about all of those node types.
> It feels very awkward.
>
> So I have a slightly different proposal. Let's forget about letting
> T_CustomScan or T_ForeignScan or any other built-in node type vary in
> size. Instead, let's add T_ExtensibleNode which acts as a completely
> user-defined node. It has read/out/copy/equalfuncs support along the
> lines of what this patch implements, and that's it. It's not a scan
> or a path or anything like that: it's just an opaque data container
> that the system can input, output, copy, and test for equality, and
> nothing else. Isn't that useless? Not at all. If you're creating an
> FDW or custom scan provider and want to store some extra data, you can
> create a new type of extensible node and stick it in the fdw_private
> or custom_private field! The data won't be part of the CustomScan or
> ForeignScan structure itself, as in this patch, but who cares? The
> only objection I can see is that you still need several pointer
> deferences to get to the data since fdw_private is a List, but if
> that's actually a real performance problem we could probably fix it by
> just changing fdw_private to a Node instead. You'd still need one
> pointer dereference, but that doesn't seem too bad.
>
> Thoughts?
>
The new callbacks of T_ExtensibleNode will replace the necessity to
form and deform process of private values, like as:
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L114
It transforms a bunch of internal data of CustomScan (similar to the
extended fields in T_ExtensibleNode) to/from the node functions
understandable forms for copy, input and output support.
I think it implies you proposition is workable.
I'd like to follow this proposition basically.
On the other hands, I have two points I want to pay attention.
1. At this moment, it is allowed to define a larger structure that
embeds CustomPath and CustomScanState by extension. How do we treat
this coding manner in this case? Especially, CustomScanState has no
private pointer dereference because it assumes an extension define
a larger structure. Of course, we don't need to care about node
operations on Path and PlanState nodes, but right now.
2. I intended to replace LibraryName and SymbolName fields from the
CustomScanMethods structure by integration of extensible node type.
We had to give a pair of these identifiers because custom scan provider
has no registration points at this moment. A little concern is extension
has to assume a particular filename of itself.
But, probably, it shall be a separated discussion. My preference is
preliminary registration of custom scan provider by its name, as well
as extensible node.
Towards the last question; whether *_private shall be void * or List *,
I want to keep fdw_private and custom_private as List * pointer, because
a new node type definition is a bit overdone job if this FDW or CSP will
take only a few private fields with primitive data types.
It is a preferable features when extension defines ten or more private
fields.
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2016-02-08 00:50:44 | Re: Using quicksort for every external sort run |
Previous Message | Robert Haas | 2016-02-07 22:01:36 | Re: Parallel Aggregate |