From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Custom Scans and private data |
Date: | 2015-08-27 23:53:51 |
Message-ID: | 20150827235351.GA4147@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2015-08-27 18:59:13 -0400, Tom Lane wrote:
> But I do think that letting custom-scan extensions fool about with
> the semantics of plan-copying (and plan-serialization for that matter)
> is a risky choice that is not justified by some marginal gains in code
> readability
To me the likelihood of having bugs in manual (de)serilization via lists
et al. is higher than introducing bugs via copyfuncs. Or even
out/readfuncs, especially as we could easily add a bunch of
verifications to the latter by verifying field names in READ_*. Wether a
prepared statement fails because a copyfuncs hook failed, or whether
it's in the fdw itself doesn't seem to make that much of a difference.
> especially when there are other feasible ways to attack the
> readability problem.
Lets at least add a Value variant that can store strings that aren't
null terminated, that'd already make things a lot better. Right now you
need to reuse Const nodes and stuff a bytea in there or such.
But even then, that only works if you have a single flat datastructure
that doesn't have any pointers in it. copyObject() style copying can
easily cope with that, using a binary blob really can't.
As far as I can see you right now basically need to write code for
custom scans so that between planning and execution you unconditionally
do:
Planning:
1) create your internal data
2) serialize data,
Execution:
3) unserialize data
4) use unserialized data
There's no way to stash data between 2 and 3 anywhere, even if there's
no need for the whole copying rigamarole because it was just a plain
statement.
If you look at the various *Scan, *Join etc. nodes we have in core, most
of them have a bunch of variable length data structures and pointers in
them. All that you can't sanely do for a custom scan node.
> copyObject needs to be a pretty self-contained operation with not a
> lot of external dependencies
Why?
> > [ ruminations about how to improve the system's extensibility ]
>
> Yeah, I've entertained similar thoughts in the past. It's not very clear
> how we get there from here, though --- for instance, any fundamental
> change in the Node system would break so much code that it's unlikely
> anyone would thank us for it, even assuming we ever finished the
> conversion project.
How about adding a single 'ExtensibleNode' node type (inheriting from
Node as normal). Many of the switches over node types would gain one
additional entry for that, and do something like
case T_ExtensibleNode:
handler = LookupHandler((ExtensibleNode *) node)->extended_type, Handler_ExecInitNode);
if (handler)
{
result = ((ExecInitNodeHandler *) handler)(node, estate, eflags);
break;
}
/* fall through to error */
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(node));
that should be doable one-by-one in many of these bigger
switches. There's obviously some limits where that's doable, but I don't
think it'd break much code?
How exactly to identify extended_type in a way that makes sense
e.g. between restarting pg, backends, primary/standby isn't trivial, but
I do think it should be doable.
> I'm also less than convinced that "I should be able to install major new
> features without touching the core code" is actually a sane or useful goal
> to have.
Yea, I have some doubts there as well. There's a lot of features though
which we really don't want in core, where some of the limitations of the
node system really do become problematic.
> As a concrete example, you mentioned the lack of extensibility of the
> bison parser, which is undoubtedly a blocking issue if you want a new
> feature that would require new SQL syntax.
While that's a problem, where I really don't have any good answer, I
also think in many cases it's primarily DDL, and primarily extending
existing DDL. By far the most things I've seen want to add informations
to tables and/or columns - and storage options actually kinda give you
that on the grammar level. We just don't allow you too sanely hook into
it, and there's a bunch of other pointless restrictions.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2015-08-28 00:01:12 | Re: PostgreSQL for VAX on NetBSD/OpenBSD |
Previous Message | Tom Lane | 2015-08-27 23:28:25 | Re: PostgreSQL for VAX on NetBSD/OpenBSD |