Re: On disable_cost

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: On disable_cost
Date: 2024-05-06 12:27:38
Message-ID: CA+TgmoZ7LOTtQgWmEgSUSnVkBiiOyJ5SMQAqCvpJsPr7baTfzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, May 4, 2024 at 9:16 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I don't think you'd need to wait longer than where we do set_cheapest
> and find no paths to find out that there's going to be a problem.

I'm confused by this response, because I thought that the main point
of my previous email was explaining why that's not true. I showed an
example where you do find paths at set_cheapest() time and yet are
unable to complete planning.

> I don't think redoing planning is going to be easy or even useful. I
> mean what do you change when you replan? You can't just do
> enable_seqscan and enable_nestloop as if there's no index to provide
> sorted input and the plan requires some sort, then you still can't
> produce a plan. Adding enable_sort to the list does not give me much
> confidence we'll never fail to produce a plan either. It just seems
> impossible to know which of the disabled ones caused the RelOptInfo to
> have no paths. Also, you might end up enabling one that caused the
> planner to do something different than it would do today. For
> example, a Path that today incurs 2x disable_cost vs a Path that only
> receives 1x disable_cost might do something different if you just went
> and enabled a bunch of enable* GUCs before replanning.

I agree that there are problems here, both in terms of implementation
complexity and also in terms of what behavior you actually get, but I
do not think that a proposal which changes some current behavior
should be considered dead on arrival. Whatever new behavior we might
want to implement needs to make sense, and there need to be good
reasons for making whatever changes are contemplated, but I don't
think we should take the position that it has to be identical to
current.

> I think the int Path.disabledness idea is worth coding up to try it.
> I imagine that a Path will incur the maximum of its subpath's
> disabledness's then add_path() just needs to prefer lower-valued
> disabledness Paths.

It definitely needs to be sum, not max. Otherwise you can't get the
matest example from the regression tests right, where one child lacks
the ability to comply with the GUC setting.

> That doesn't get you the benefits of fewer CPU cycles, but where did
> that come from as a motive to change this? There's no shortage of
> other ways to make the planner faster if that's an issue.

Well, I don't agree with that at all. If there are lots of ways to
make the planner faster, we should definitely do a bunch of that
stuff, because "will slow down the planner too much" has been a
leading cause of proposed planner patches being rejected for as long
as I've been involved with the project. My belief was that we were
rather short of good ideas in that area, actually. But even if it's
true that we have lots of other ways to speed up the planner, that
doesn't mean that it wouldn't be good to do it here, too.

Stepping back a bit, my current view of this area is: disable_cost is
highly imperfect both as an idea and as implemented in PostgreSQL.
Although I'm discovering that the current implementation gets more
things right than I had realized, it also sometimes gets things wrong.
The original poster gave an example of that, and there are others.
Furthermore, the current implementation has some weird
inconsistencies. Therefore, I would like something better. Better, to
me, could mean any combination of (a) superior behavior, (b) superior
performance, and (c) simpler, more elegant code. In a perfect world,
we'd be able to come up with something that wins in all three of those
areas, but I'm not seeing a way to achieve that, so I'm trying to
figure out what is achievable. And because we need to reach consensus
on whatever is to be done, I'm sharing raw research results rather
than just dropping a completed patch. I don't think it's at all easy
to understand what the realistic possibilities are in this area;
certainly it isn't for me. At some point I'm hoping that there will be
a patch (or a bunch of patches) that we can all agree are an
improvement over now and the best we can reasonably do, but I don't
yet know what the shape of those will be, because I'm still trying to
understand (and document on-list) what all the problems are.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2024-05-06 12:34:22 Possible to include xid8 in logical replication
Previous Message Jakub Wartak 2024-05-06 10:55:27 Re: apply_scanjoin_target_to_paths and partitionwise join