Re: allowing extensions to control planner behavior

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: allowing extensions to control planner behavior
Date: 2024-08-26 19:28:53
Message-ID: CA+TgmoZ9PNTKBOeP2rAXoi15nTg+NpYUiSGRgpMhz0nY1wpUtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 26, 2024 at 1:37 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I'm somewhat expecting to be flamed to a well-done crisp for saying
> > this, but I think we need better ways for extensions to control the
> > behavior of PostgreSQL's query planner.
>
> Nah, I won't flame you for that, it's a reasonable thing to think
> about. However, the devil is in the details, and ...

Thank you. Not being flamed is one of my favorite things. :-)

> > The attached patch, briefly mentioned above, essentially converts the
> > enable_* GUCs into RelOptInfo properties where the defaults are set by
> > the corresponding GUCs.
>
> ... this doesn't seem like it's moving the football very far at all.
> The enable_XXX GUCs are certainly blunt instruments, but I'm not sure
> how much better it is if they're per-rel. For example, I don't see
> how this gets us any closer to letting an extension fix a poor choice
> of join order. Or, if your problem is that the planner wants to scan
> index A but you want it to scan index B, enable_indexscan won't help.

Well, I agree that this doesn't address everything you might want to
do, and I thought I said so, admittedly in the middle of a long wall
of text. This would JUST be a step toward letting an extension control
the scan and join methods, not the join order or the choice of index
or whatever else there is. But the fact that it doesn't do everything
is not a strike against it unless there's some competing design that
lets you take care of everything with a single mechanism, which I do
not see as realistic. If this proposal -- or really any proposal in
this area -- gets through, I will very happily propose more things to
address the other problems that I know about, but it doesn't make
sense to do a huge amount of work to craft a comprehensive solution
before we've had any discussion here.

> Yeah, these sorts of questions aren't made better this way either.
> If anything, having extensions manipulating these variables will
> make it even harder to rethink what they do.

Correct, but my proposal to make enable_indexscan behave like
enable_indexonlyscan, which I thought was a slam-dunk, just provoked a
lot of grumbling. There's a kind of chicken and egg problem here. If
the existing GUCs were better designed, then using them here would
make sense. And the patch that I attached to my previous email were in
master, then cleaning up the design of the GUCs would have more value.
But if I can't make any progress with either problem because the other
problem also exists, then I'm kind of boxed into a corner. I could
also propose something here that is diverges from the enable_*
behavior, but then people will complain that the two shouldn't be
inconsistent, which I agree with, BTW. I thought maybe doing this
first would make sense, and then we could refine afterwards.

> You mentioned that there is prior art out there, but this proposal
> doesn't seem like it's drawing on any such thing. What ideas should
> we be stealing?

Depends what you mean. As far as PostgreSQL-related things, the two
things that I mentioned in my opening paragraph and for which I
provided links seem to be me to the best examples we have. It's pretty
easy to see how to make pg_hint_plan require less kludgery, and I
think we can just iterate through the various problems there and solve
them pretty easily by adding a few hooks here and there and a few
extension-settable structure members here and there. I am engaging in
some serious hand-waving here, but this is not rocket science. I am
confident that if you made it your top priority to get into PG 18
stuff which would thoroughly un-hackify pg_hint_plan, you could be
done in months, possibly weeks. It will take me longer, but if we have
an agreement in principal that it is worth doing, I just can't see it
as being particularly difficult.

Amazon's query plan management stuff is a much tougher lift. For that,
you're asking the planner to try to create a new plan which is like
some old plan that you got before. So in a perfect world, you want to
control every planner decision. That's hard just because there are a
lot of them. If for example you want to get the same index scan that
you got before, you need not only to get the same type of index scan
(index, index-only, bitmap) and the same index, but also things like
the same non-native saop treatment, which seems like it would be
asking an awful lot of a hook system. On the other hand, maybe you can
cheat. If your regurgitate-the-same-plan system could force the same
join order, join methods, scan methods, choice of indexes, and
probably some stuff about aggregate and appendrel strategy, it might
be close enough to giving you the same plan you had before that nobody
would really care if the non-native saop treatment was different. I'm
almost positive it's better than not having a feature, which is where
are today. And although allowing control over just the major decisions
in query planning doesn't seem like something we can do in one patch,
I don't think it takes 100 patches either. Maybe five or ten.

If we step outside of the PostgreSQL ecosystem, I think we should look
at Oracle as one example. I have never been a real believer in hints
like SeqScan(foo), because if you don't fix the cardinality estimate
for table foo, then the rest of your plan is going to suck, too. On
the other hand, "hint everything" for some people in some situations
is a way to address that. It's stupid in a sense, but if you have an
automated way to do it, especially one that allows applying hints
out-of-query, it's not THAT stupid. Also, Oracle offers some other
pretty useful hints. In particular, using the LEADING hint to set the
driving table for the query plan does not seem dumb to me at all.
Hinting that things should be parallel or not, and with what degree of
parallelism, also seem quite reasonable. They've also got ALL_ROWS and
FIRST_ROWS(n) hints, which let you say whether you want fast-start
behavior or not, and it hardly needs to be said how often we get that
wrong or how badly. pg_hint_plan, which copies a lot of stuff that
Oracle does, innovates by allowing you to hint that a certain join
will return X number of rows or that the number or rows that the
planner thinks should be returned should be corrected by multiplying,
adding, or subtracting some constant. I'm not sure how useful this is
really because I feel like a lot of times you'd just pick some join
order where that particular join is no longer used e.g. if. A JOIN B
JOIN C and I hint the AB join, perhaps the planner will just start by
joining C to either A or B, and then that join will never occur.
However, that can be avoided by also using LEADING, or maybe in some
other cleverer way, like making an AB selectivity hint apply at
whatever point in the plan you join something that includes A to
something that includes B.

There's some details on SQL server's hinting here:
https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql?view=sql-server-ver16

It looks pretty complicated, but some of the basic concepts that you'd
expect are also present here: force the join method, rule in or out,
force the use of an index or of no index, force the join order. Those
seem to be the major things that "everyone" supports. I think we'd
want to expand a bit on that to allow forcing aggregate strategy and
perhaps some PostgreSQL-specific things e.g. other systems won't have
a hint to force a TIDRangeScan or not because that's a
PostgreSQL-specific concept, but it would be silly to make a system
that lets an extension control sequential scans and index scans but
not other, more rarely-used ways of scanning a relation, so probably
we want to do something.

I don't know if that helps, in terms of context. If it doesn't, let me
know what would help. And just to be clear, I *absolutely* think we
need to take a light touch here. If we install a ton of new
highly-opinionated infrastructure we will make a lot of people mad and
that's definitely not where I want to end up. I just think we need to
grow beyond "the planner is a black box and you shall not presume to
direct it." If every other system provides a way to control, say, the
join order, then it seems reasonable to suppose that a PostgreSQL
extension should be able to control the join order too. A lot of
details might be different but if multiple other systems have the
concept then the concept itself probably isn't ridiculous.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John H 2024-08-26 19:31:53 Re: Allow logical failover slots to wait on synchronous replication
Previous Message John H 2024-08-26 19:28:06 Re: Allow logical failover slots to wait on synchronous replication