Re: On disable_cost

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: On disable_cost
Date: 2024-05-04 16:57:32
Message-ID: 2930629.1714841852@sss.pgh.pa.us
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> I don't think you'd need to wait longer than where we do set_cheapest
> and find no paths to find out that there's going to be a problem.

At a base relation, yes, but that doesn't work for joins: it may be
that a particular join cannot be formed, yet other join sequences
will work. We have that all the time from outer-join ordering
restrictions, never mind enable_xxxjoin flags. So I'm not sure
that we can usefully declare early failure for joins.

> I think the int Path.disabledness idea is worth coding up to try it.
> I imagine that a Path will incur the maximum of its subpath's
> disabledness's then add_path() just needs to prefer lower-valued
> disabledness Paths.

I would think sum not maximum, but that's a detail.
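
For illustration only, here is a minimal sketch of how the comparison might look, assuming a simplified stand-in for Path. The names SketchPath, combine_sum, combine_max and sketch_compare_paths are hypothetical and do not exist in the PostgreSQL source; the real add_path() also weighs startup cost, pathkeys, parameterization and parallel safety, all of which this ignores. How a node's own disabled flag enters the "maximum" variant is a guess, since the thread does not spell it out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the planner's Path struct. */
typedef struct SketchPath
{
    int     disabledness;   /* how "disabled" this subtree is; 0 = not at all */
    double  total_cost;     /* estimated cost, with no disable_cost added in */
} SketchPath;

/* "Sum" variant: every disabled node anywhere in the subtree counts. */
static int
combine_sum(int outer, int inner, int self)
{
    return outer + inner + self;
}

/*
 * "Maximum" variant: take the worse of the two children, then add this
 * node's own contribution (that last step is an assumption).
 */
static int
combine_max(int outer, int inner, int self)
{
    int     worst = (outer > inner) ? outer : inner;

    return worst + self;
}

/*
 * Return <0 if a is preferable, >0 if b is preferable, 0 if tied.
 * Lower disabledness always wins; cost only breaks ties.
 */
static int
sketch_compare_paths(const SketchPath *a, const SketchPath *b)
{
    assert(a != NULL && b != NULL);

    if (a->disabledness != b->disabledness)
        return (a->disabledness < b->disabledness) ? -1 : 1;
    if (a->total_cost != b->total_cost)
        return (a->total_cost < b->total_cost) ? -1 : 1;
    return 0;
}
```

With the sum variant, a join over two disabled scans ranks worse than a join over one; with the maximum variant the two tie on disabledness and cost alone decides between them.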

> That doesn't get you the benefits of fewer CPU cycles, but where did
> that come from as a motive to change this? There's no shortage of
> other ways to make the planner faster if that's an issue.

The concern was to not *add* CPU cycles in order to make this area
better. But I do tend to agree that we've exhausted all the other
options.

BTW, I looked through costsize.c just now to see exactly what we are
using disable_cost for, and it seemed like a majority of the cases are
just wrong. Where possible, we should implement a plan-type-disable
flag by not generating the associated Path in the first place, not by
applying disable_cost to it. But it looks like a lot of people have
erroneously copied the wrong logic. I would say that only these plan
types should use the disable_cost method:

	seqscan
	nestloop join
	sort

as those are the only ones where we risk not being able to make a
plan at all for lack of other alternatives.
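
A rough sketch of that split, under stated assumptions: the enum names and the function sketch_action_for are hypothetical and not taken from costsize.c, and the set of plan types shown is only a sample. The idea is that a disabled plan type is simply not generated unless it is one of the three fallbacks listed above, in which case it is still generated but penalized.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sample of plan types; not the planner's real node tags. */
typedef enum SketchPlanType
{
    SK_SEQSCAN,
    SK_INDEXSCAN,
    SK_NESTLOOP,
    SK_MERGEJOIN,
    SK_HASHJOIN,
    SK_SORT,
    SK_HASHAGG
} SketchPlanType;

typedef enum SketchAction
{
    SK_GENERATE,            /* enabled: build the Path normally */
    SK_GENERATE_PENALIZED,  /* disabled, but must exist as a last resort */
    SK_SKIP                 /* disabled: do not build the Path at all */
} SketchAction;

/* The three types for which there may be no alternative plan. */
static bool
sketch_is_last_resort(SketchPlanType type)
{
    return type == SK_SEQSCAN || type == SK_NESTLOOP || type == SK_SORT;
}

/* Decide what to do with a plan type given its enable_xxx setting. */
static SketchAction
sketch_action_for(SketchPlanType type, bool enabled)
{
    assert(type >= SK_SEQSCAN && type <= SK_HASHAGG);

    if (enabled)
        return SK_GENERATE;
    return sketch_is_last_resort(type) ? SK_GENERATE_PENALIZED : SK_SKIP;
}
```

For example, sketch_action_for(SK_HASHJOIN, false) yields SK_SKIP, while sketch_action_for(SK_SEQSCAN, false) yields SK_GENERATE_PENALIZED.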

There is also some weirdness around needing to force use of tidscan
if we have WHERE CURRENT OF. But perhaps a different hack could be
used for that.

We also have this for hashjoin:

	 * If the bucket holding the inner MCV would exceed hash_mem, we don't
	 * want to hash unless there is really no other alternative, so apply
	 * disable_cost.

I'm content to leave that be, if we can't remove disable_cost
entirely.

What I'm wondering at this point is whether we need to trouble with
implementing the separate-disabledness-count method, if we trim back
the number of places using disable_cost to the absolute minimum.

regards, tom lane
