Re: How to retain lesser paths at add_path()?

From: Kohei KaiGai <kaigai(at)heterodb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Richard Guo <riguo(at)pivotal(dot)io>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to retain lesser paths at add_path()?
Date: 2019-10-07 16:04:45
Message-ID: CAOP8fzY8QfTp=e0_kA-HDVs=43DGUSzLmPNhrZyjucTjik8RoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2019年10月7日(月) 23:44 Robert Haas <robertmhaas(at)gmail(dot)com>:
>
> On Mon, Oct 7, 2019 at 9:56 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > We could imagine, maybe, that a hook for the purpose of allowing an
> > additional dimension to be considered would be essentially a path
> > comparison function, returning -1, +1, or 0 depending on whether
> > path A is dominated by path B (on this new dimension), dominates
> > path B, or neither. However, I do not see how multiple extensions
> > could usefully share use of such a hook.
>
> Typically, we support hook-sharing mostly by ad-hoc methods: when
> installing a hook, you remember the previous value of the function
> pointer, and arrange to call that function yourself. That's not a
> great method. One problem with it is that you can't reasonably
> uninstall a hook function, which would be a nice thing to be able to
> do. We could do better by reusing the technique from on_shmem_exit or
> RegisterXactCallbacks: keep an array of pointers, and call them in
> order. I wish we'd retrofit all of our hooks to work more like that;
> being able to unload shared libraries would be a good feature.
>
> But if we want to stick with the ad-hoc method, we could also just
> have four possible return values: dominates, dominated-by, both, or
> neither.
>
It seems to me this is a bit different from the purpose of this hook.
I never intend to overwrite existing cost-based decision by this hook.
The cheapest path at a particular level is the cheapest one regardless
of the result of this hook. However, this hook enables to prevent
immediate elimination of a particular path that we (extension) want to
use it later and may have potentially cheaper cost (e.g; a pair of
custom GpuAgg + GpuJoin by reduction of DMA cost).

So, I think expected behavior when multiple extensions would use
this hook is clear. If any of call-chain on the hook wanted to preserve
the path, it should be kept on the pathlist. (Of couse, it is not a cheapest
one)

> Still, this doesn't feel like very scalable paradigm, because this
> code gets called a lot. Unless both calling the hook functions and
> the hook functions themselves are dirt-cheap, it's going to hurt, and
> TBH, I wonder if even the cost of detecting that the hook is unused
> might be material.
>
> I wonder whether we might get a nicer solution to this problem if our
> method of managing paths looked less something invented by a LISP
> hacker, but I don't have a specific proposal in mind.
>
One other design in my mind is, add a callback function pointer on Path
structure. Only if Path structure has valid pointer (not-NULL), add_path()
calls extension's own logic to determine whether the Path can be
eliminated now.
This design may minimize the number of callback invocation.

One potential downside of this approach is, function pointer makes
hard to implement readfuncs of Path nodes, even though we have
no read handler of them, right now.

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai(at)heterodb(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-10-07 16:14:52 Re: abort-time portal cleanup
Previous Message Tom Lane 2019-10-07 15:57:36 Re: How to retain lesser paths at add_path()?