Re: Custom Scan APIs (Re: Custom Plan node)

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PgHacker <pgsql-hackers(at)postgresql(dot)org>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Custom Scan APIs (Re: Custom Plan node)
Date: 2014-04-15 01:06:32
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9836C@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > [ assorted comments about custom-scan patch, but particularly ]
>
> > * The prune hook makes me feel very uneasy. It seems weirdly specific
> > implementation detail, made stranger by the otherwise lack of data
> > maintenance API calls. Calling that for every dirty page sounds like
> > an issue and my patch rejection indicator is flashing red around that.
>
> Yeah. After a fast review of the custom-scan and cache-scan patches, it
> seems to me that my original fears are largely confirmed: the custom scan
> patch is not going to be sufficient to allow development of any truly new
> plan type. Yeah, you can plug in some new execution node types, but actually
> doing anything interesting is going to require patching other parts of the
> system. Are we going to say to all comers, "sure, we'll put a hook call
> anywhere you like, just ask"? I can't see this as being the way to go.
>
Here is two different points to be discussed; one is generic to the custom-
plan API, and other is specific to my cache-only scan implementation.

Because existing plan/exec nodes are all built-in and some functional stuffs
are consolidated to a particular source file (like createplan.c, setrefs.c),
so it does not make problems if commonly called functions are declared as
static functions.
Custom-plan API changes this assumption, in other words, it allows to have
some portion of jobs in createplan.c or setrefs.c externally, so it needs
to have the commonly used functions being external.
Because I had try & error during development, I could not list up all the
functions to be public at once. However, it is not a fundamental matter,
should be solved during the discussion on pgsql-hackers.

Regarding to the specific portion in the cache-only scan, it may happen
if we want to create an extension that tracks vacuuming, independent from
custom-scan.
Usually, extension utilizes multiple hooks and interfaces to implement
the feature they want to do. In case of cache-only scan, unfortunately,
PG lacks a way to track heap vacuuming even though it needed to invalidate
cached data. It is unrelated issue from the custom-scan API. We may see
same problem if I tried to create an extension to count number of records
being vacuumed.

> Another way of describing the problem is that it's not clear where the API
> boundaries are for potential users of a custom-scan feature. (Simon said
> several things that are closely related to this point.) One thing I don't
> like at all about the patch is its willingness to turn anything whatsoever
> into a publicly exported function, which basically says that the design
> attitude is there *are* no boundaries. But that's not going to lead to
> anything maintainable. We're certainly not going to want to guarantee that
> these suddenly-exported functions will all now have stable APIs
> forevermore.
>
I'd like to have *several* existing static functions as (almost) stable
APIs, but not all. Indeed, my patch randomly might pick up static functions
to redefine as external functions, however, it does not mean custom-plan
eventually requires all the functions being external.
According to my investigation, here is two types of functions to be exposed.
- A function that walks on plan/exec node tree recursively
(Eg: create_plan_recurse)
- A function that adjusts internal state of the core backend
(Eg: fix_expr_common)

At least, these functions are not majority. I don't think it should be
a strong blocker of this new feature.
(I may have oversights of course, please point out.)

> Overall I concur with Simon's conclusion that this might be of interest
> for R&D purposes, but it's hard to see anyone wanting to support a production
> feature built on this. It would be only marginally less painful than
> supporting a patch that just adds the equivalent code to the backend in
> the traditional way.
>
As we adjusted FDW APIs through the first several releases, in general,
any kind of interfaces takes time to stabilize. Even though it *initially*
sticks on R&D purpose (I don't deny), it shall be brushed up to production
stage. I think a feature for R&D purpose is a good start-point.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jov 2014-04-15 01:06:40 PostgreSQL hang on FreeBSD,with CFLAGS='-O2 -pthread' workaround
Previous Message Bruce Momjian 2014-04-15 00:51:31 Re: [GENERAL] CLOB & BLOB limitations in PostgreSQL