Re: Custom Scan APIs (Re: Custom Plan node)

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc:	Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject:	Re: Custom Scan APIs (Re: Custom Plan node)
Date:	2014-02-26 04:22:21
Message-ID:	20140226042221.GJ2921@tamriel.snowman.net
Lists:	pgsql-hackers

* Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> > Instead of custom node, it might be better idea to improve FDW infrastructure
> > to push join. For the starters, is it possible for the custom scan node
> > hooks to create a ForeignScan node? In general, I think, it might be better
> > for the custom scan hooks to create existing nodes if they serve the purpose.
> >
> It does not work well because existing FDW infrastructure is designed to
> perform on foreign tables, not regular tables. Probably, it needs to revise
> much our assumption around the background code, if we re-define the purpose
> of FDW infrastructure. For example, ForeignScan is expected to return a tuple
> according to the TupleDesc that is exactly same with table definition.
> It does not fit the requirement if we replace a join-node by ForeignScan
> because its TupleDesc of joined relations is not predefined.

I'm not following this logic at all- how are you distinguishing "foreign" from
"regular"? Certainly, in-memory-only tables which are sitting out in
some non-persistent GPU memory aren't "regular" by any PG definition.
Perhaps you can't make ForeignScan suddenly work as a join-node
replacement, but I've not seen where anyone has proposed that (directly-
I've implied it on occasion where a remote view can be used, but that's
not the same thing as having proper push-down support for joins).

> I'd like to define these features are designed for individual purpose.

My previous complaint about this patch set has been precisely that each
piece seems to be custom-built and every patch needs more and more
backend changes. If every time someone wants to do something with this
CustomScan API, they need changes made to the backend code, then it's
not a generally useful external API. We really don't want to define
such an external API as then we have to deal with backwards
compatibility, particularly when it's all specialized to specific use
cases which are all different.

> FDW is designed to intermediate an external data source and internal heap
> representation according to foreign table definition. In other words, its
> role is to generate contents of predefined database object on the fly.

There's certainly nothing in the FDW API which requires that the remote
side have an internal heap representation, as evidenced by the various
FDWs which already exist and certainly are not any kind of 'normal'
heap. Every query against the foreign relation goes through the FDW API
and can end up returning whatever the FDW author decides is appropriate
to return at that time, as long as it matches the tuple description-
which is absolutely necessary for any kind of sanity, imv.

> On the other hands, custom-scan is designed to implement alternative ways
> to scan / join relations in addition to the methods supported by built-in
> feature.

I can see the usefulness in being able to push down aggregates or other
function-type calls to the remote side of an FDW and would love to see
work done along those lines, along with the ability to push down joins
to remote systems- but I'm not convinced that the claimed flexibility
with the CustomScan API is there, given the need to continue modifying
the backend code for each use-case, nor that there are particularly new
and inventive ways of saying "find me all the cases where set X overlaps
with set Y". I'm certainly open to the idea that we could have an FDW
API which allows us to ask exactly that question and let the remote side
cost it out and give us an answer for a pair of relations but that isn't
what this is. Note also that in any kind of aggregation push-down we
must be sure that the function is well-defined and that the FDW is on
the hook to ensure that the returned data is the same as if we ran the
same aggregate function locally, otherwise the results of a query might
differ based on whether the aggregate was run locally or remotely (which
could be influenced by costing- eg: the size of the relation or its
statistics).
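
To make the aggregate hazard concrete, here is a minimal Python sketch. It is
only a toy model, not PostgreSQL code: `local_avg`, `remote_avg_mismatched`,
`run_avg`, and the row-count threshold standing in for costing are all
hypothetical names invented for illustration. It assumes a remote side whose
average counts NULLs as zero, while the local average skips them the way SQL
avg() does.

```python
from typing import Callable, Optional, Sequence

Values = Sequence[Optional[float]]


def local_avg(values: Values) -> Optional[float]:
    """Average that skips NULLs (None), matching SQL avg() semantics."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def remote_avg_mismatched(values: Values) -> Optional[float]:
    """Hypothetical remote aggregate that wrongly counts NULLs as zero."""
    if not values:
        return None
    return sum(0 if v is None else v for v in values) / len(values)


def run_avg(
    values: Values,
    push_down_threshold: int,
    remote_avg: Callable[[Values], Optional[float]],
) -> Optional[float]:
    """Toy 'planner': push the aggregate down once the relation is big enough.

    The threshold is a stand-in for real costing (assumption for this sketch).
    """
    if len(values) >= push_down_threshold:
        return remote_avg(values)
    return local_avg(values)


data = [10.0, None, 20.0]
local_result = run_avg(data, 100, remote_avg_mismatched)  # stays local
pushed_result = run_avg(data, 2, remote_avg_mismatched)   # pushed down
```

With the same data, the local plan averages the two non-NULL values while the
pushed-down plan divides by all three rows, so the answer changes purely
because the costing stand-in picked a different plan.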

> I'm motivated to implement GPU acceleration feature that works transparently
> for application. Thus, it has to be capable on regular tables, because most
> of application stores data on regular tables, not foreign ones.

You want to persist that data in the GPU across multiple calls though,
which makes it unlike any kind of regular PG table and much more like
some foreign table. Perhaps the data is initially loaded from a local
table and then updated on the GPU card in some way when the 'real' table
is updated, but neither of those makes it a "regular" PG table.

> > Since a custom node is open implementation, it will be important to pass
> > as much information down to the hooks as possible; lest the hooks will be
> > constrained. Since the functions signatures within the planner, optimizer
> > will change from time to time, so the custom node hook signatures will need
> > to change from time to time. That might turn out to be maintenance overhead.

It's more than "from time-to-time", it was "for each use case in the
given patch set asking for this feature", which is why I'm pushing back
on it.

> Yes. You are also right. But it also makes maintenance overhead if hook has
> many arguments nobody uses.

I can agree with this- there should be a sensible API if we're going to
do this.

> Probably, it makes sense to list up the arguments that cannot be reproduced
> from other information, can be reproduced but complicated steps, and can be
> reproduced easily.

This really strikes me as the wrong approach for an FDW join-pushdown
API, which should be geared around giving the remote side an opportunity
on a case-by-case basis to cost out joins using whatever methods it has
available to implement them. I've outlined above the reasons I don't
agree with just making the entire planner/optimizer pluggable.
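
The shape of such a case-by-case costing API can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not a proposal for
actual FDW callbacks: `JoinPath`, `choose_join_path`, `example_remote_coster`,
the relation names, and the cost numbers are all hypothetical. The planner asks
the remote side to cost a join for one pair of relations, the remote side may
decline, and the cheaper path wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class JoinPath:
    provider: str  # "local" or "remote"
    cost: float


# Hypothetical callback: returns a cost for joining the pair remotely,
# or None when the remote side cannot perform that join.
RemoteJoinCoster = Callable[[str, str], Optional[float]]


def choose_join_path(
    outer: str,
    inner: str,
    local_cost: float,
    remote_coster: RemoteJoinCoster,
) -> JoinPath:
    """Ask the remote side to cost this one join and keep the cheaper path."""
    best = JoinPath("local", local_cost)
    remote_cost = remote_coster(outer, inner)
    if remote_cost is not None and remote_cost < best.cost:
        best = JoinPath("remote", remote_cost)
    return best


def example_remote_coster(outer: str, inner: str) -> Optional[float]:
    """Toy remote side that only knows how to join orders with customers."""
    if {outer, inner} == {"orders", "customers"}:
        return 40.0
    return None


pushed = choose_join_path("orders", "customers", 100.0, example_remote_coster)
kept_local = choose_join_path("orders", "items", 100.0, example_remote_coster)
```

The point of the design is that the backend keeps one narrow question ("what
would this join cost you?") rather than exposing planner internals per use
case.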

Thanks,

Stephen

In response to

Re: Custom Scan APIs (Re: Custom Plan node) at 2014-02-25 10:09:50 from Kouhei Kaigai

Responses

Re: Custom Scan APIs (Re: Custom Plan node) at 2014-02-26 06:50:32 from Kouhei Kaigai
