allowing extensions to control planner behavior

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: allowing extensions to control planner behavior
Date: 2024-08-26 16:32:53
Message-ID: CA+TgmoZY+baV-T-5ifDn6P=L=aV-VkVBrPmi0TQkcEq-5Finww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm somewhat expecting to be flamed to a well-done crisp for saying
this, but I think we need better ways for extensions to control the
behavior of PostgreSQL's query planner. I know of two major reasons
why somebody might want to do this. First, you might want to do
something like what pg_hint_plan does, where it essentially implements
Oracle-style hints that can be either inline or stored in a side table
and automatically applied to queries.[1] In addition to supporting
Oracle-style hints, it also supports some other kinds of hints so that
you can, for example, try to fix broken cardinality estimates. Second,
you might want to convince the planner to keep producing the same kind
of plan that it produced previously. I believe this is what Amazon's
query plan management feature[2] does, although since it is closed
source and I don't work at Amazon maybe it's actually implemented
completely differently. Regardless of what Amazon did in this case,
plan stability is a feature people want. Just trying to keep using the
same plan data structure forever doesn't seem like a good strategy,
because for example it would be fragile in the case of any DDL
changes, like dropping and recreating an index, or dropping or adding
a column. But you might want conceptually the same plan. Although it's
not frequently admitted on this mailing list, unexpected plan changes
are a frequent cause of sudden database outages, and wanting to
prevent that is a legitimate thing for a user to try to do. Naturally,
there is a risk that you might in so doing also prevent plan changes
that would have dramatically improved performance, or stick with a
plan long after you've outgrown it, but that doesn't stop people from
wanting it, or other databases (or proprietary forks of this database)
from offering it, and I don't think it should.

We have some hooks right now that offer a few options in this area,
but there are problems. The hook that I believe to be closest to the
right thing is this one:

/*
* Allow a plugin to editorialize on the set of Paths for this base
* relation. It could add new paths (such as CustomPaths) by calling
* add_path(), or add_partial_path() if parallel aware. It could also
* delete or modify paths added by the core code.
*/
if (set_rel_pathlist_hook)
(*set_rel_pathlist_hook) (root, rel, rti, rte);

Unfortunately, the part about the hook having the freedom to delete
paths isn't really true. Perhaps technically you can delete a path
that you don't want to be chosen, but any paths that were dominated by
the path you deleted have already been thrown away and it's too late
to get them back. You can modify paths if you don't want to change
their costs, but if you change their costs then you have the same
problem: the contents of the pathlist at the time that you see it are
determined by the costs that each path had when it was initially
added, and it's really too late to editorialize on that. So all you
can really do here in practice is add new paths.
set_join_pathlist_hook, which applies to joinrels, is similarly
limited. appendrels don't even have an equivalent of this hook.

So, how could we do better?

I think there are two basic approaches that are possible here. If
someone sees a third option, let me know. First, we could allow users
to hook add_path() and add_partial_path(). That certainly provides the
flexibility on paper to accept or reject whatever paths you do or do
not want. However, I don't find this approach very appealing. One
problem is that it's likely to be fairly expensive, because add_path()
gets called A LOT. A second problem is that you don't actually get an
awful lot of context: I think anybody writing a hook would have to
write code to basically analyze each proposed path and figure out why
it was getting added and then decide what to do. In some cases that
might be fine, because for example accepting or rejecting paths based
on path type seems fairly straightforward with this approach, but as
soon as you want to do anything more complicated than that it starts
to seem difficult. If, for example, you want relation R1 to be the
driving table for the whole query plan, you're going to have to
determine whether or not that is the case for every single candidate
(partial) path that someone hands you, so you're going to end up
making that decision a whole lot of times. It doesn't sound
particularly fun. Third, even if you are doing something really simple
like trying to reject mergejoins, you've already lost the opportunity
to skip a bunch of work. If you had known when you started planning
the joinrel that you didn't care about mergejoins, you could have
skipped looking for merge-joinable clauses. Overall, while I wouldn't
be completely against further exploration of this option, I suspect
it's pretty hard to do anything useful with it.

The other possible approach is to allow extensions to feed some
information into the planner before path generation and let that
influence which paths are generated. This is essentially what
pg_hint_plan is doing: it implements plan type hints by arranging to
flip the various enable_* GUCs on and off during the planning of
various rels. That's clever but ugly, and it ends up duplicating
substantial chunks of planner code due to the inadequacy of the
existing hooks. With some refactoring and some additional hooks, we
could make this much less ugly. But that gets at what I believe to be
the core difficulty of this approach, which is that the core planner
code needs to be somewhat aware of and on board with what the user or
the extension is trying to do. If an extension wants to force the join
order, that is likely to require different scaffolding than if it
wants to force the join methods which is again different from if a
hook wants to bias the query planner towards or against particular
indexes. Putting in hooks or other infrastructure that allows an
extension to control a particular aspect of planner behavior is to
some extent an endorsement of controlling the planner behavior in that
particular way. Since any amount of allowing the user to control the
planner tends to be controversial around here, that opens up the
spectre of putting a whole lot of effort into arguing about which
things extensions should be allowed to do, getting most of the patches
rejected, and ending up with nothing that's actually useful.

But on the other hand, it's not like we have to design everything in a
greenfield. Other database systems have provided in-core, user-facing
features to control the planner for decades, and we can look at those
offerings -- and existing offerings in the PG space -- as we try to
judge whether a particular use case is totally insane. I am not here
to argue that everything that every system has done is completely
perfect and without notable flaws, but our own system has its own
share of flaws, and the fact that you can do very little when a
previously unproblematic query starts suddenly producing a bad plan is
definitely one of them. I believe we are long past the point where we
can simply hold our breath and pretend like there's no issue here. At
the very least, allowing extensions to control scan methods (including
choice of indexes), join methods, and join order (including which
table ends up on which side of a given join) and similar things for
aggregates and appendrels seems to me like it ought to be table
stakes. And those extensions shouldn't have to duplicate large chunks
of code or resort to gross hacks to do it. Eventually, maybe we'll
even want to have directly user-facing features to do some of this
stuff (in query hints, out of query hints, or whatever) but I think
opening the door up to extensions doing it is a good first step,
because (1) that allows different extensions to do different things
without taking a position on what the One Right Thing To Do is and (2)
if it becomes clear that something improvident has been done, it is a
lot easier to back out a hook or some C API change than it is to
back-out a user-visible feature. Or maybe we'll never want to expose a
user-visible feature here, but it can still be useful to enable
extensions.

The attached patch, briefly mentioned above, essentially converts the
enable_* GUCs into RelOptInfo properties where the defaults are set by
the corresponding GUCs. The idea is that a hook could then change this
on a per-RelOptInfo basis before path generation happens. For
baserels, I believe that could be done from get_relation_info_hook for
baserels, and we could introduce something similar for other kinds of
rels. I don't think this is in any way the perfect approach. On the
one hand, it doesn't give you all the kinds of control over path
generation that you might want. On the other hand, the more I look at
what our enable_* GUCs actually do, the less impressed I am. IMHO,
things like enable_hashjoin make a lot of sense, but enable_sort seems
like it just controls an absolutely random smattering of behaviors in
a way that seems to me to have very little to recommend it, and I've
complained elsewhere about how enable_indexscan and
enable_indexonlyscan are really quite odd when you look at how they're
implemented. Still, this seemed like a fairly easy thing to do as a
way of demonstrating the kind of thing that we could do to provide
extensions with more control over planner behavior, and I believe it
would be concretely useful to pg_hint_plan in particular. But all that
said, as much as anything, I want to get some feedback on what
approaches and trade-offs people think might be acceptable here,
because there's not much point in me spending a bunch of time writing
code that everyone (or a critical mass of people) are going to hate.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

[1] https://github.com/ossc-db/pg_hint_plan
[2] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Optimize.html

Attachment Content-Type Size
v1-0001-Convert-enable_-GUCs-into-per-RelOptInfo-values-w.patch application/octet-stream 60.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ayush Vatsa 2024-08-26 16:34:36 Re: Proposal to have INCLUDE/EXCLUDE options for altering option values
Previous Message Tomas Vondra 2024-08-26 16:06:04 PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)