Re: On disable_cost

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: On disable_cost
Date: 2024-10-03 21:52:33
Message-ID: CAApHDvoND8j8Pg6OuhqhDkaOmw5cEfQUUN0xf5ErxX=br+hz_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 4 Oct 2024 at 02:15, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> robert.haas=# explain with recursive foo as (select count(*) from
> pgbench_accounts union all select aid from pgbench_accounts a, foo
> where aid > foo.count) select * from pgbench_accounts, foo where aid =
> foo.count;

> You might expect that the number of disabled nodes for the hash join
> would include the number for the CTE attached to it, but it doesn't. I
> suspect similar things will happen when a node has an InitPlan or
> SubPlan node attached to it. (I have not tested whether your patch
> gets these cases right.)

It looks fine with the patch. The crux of the new logic is just
summing up the disabled_nodes from the child nodes and checking if the
disabled_nodes of the current node is higher than that sum. That's not
exactly hard logic. The biggest risk seems to be not correctly
visiting all the child nodes. I got that wrong with SubqueryScan in my
first PoC.

> For sure. To be clear, it's not that I love the current output. It's
> that I'm worried that it will be hard to get the thing that you and
> Laurenz want to be fully reliable, and I think there's a chance that
> not only might it contain bugs now, but it might turn out that people
> changing logic in this area in the future introduce more bugs.
> plan_is_disabled() has to get exactly the correct answer for the child
> nodes every time, or the answer is wrong, and I'm not as confident as
> you are that your logic is fully correct (which doesn't mean that I
> can prove to you that it is incorrect; I don't even know that it is).

One thing the patch did cause me to find is the missing propagation of
disabled_nodes in make_sort(). It was very obviously wrong with the
patched EXPLAIN output and wasn't so obvious with the current output,
so perhaps you could look at this patch as a better way of ensuring
the disable_node propagation is correct. That's much harder logic to
get right than what I've added to explain.c as it's spread out in many
places.

Take this case, for example:

create table lp (a int) partition by list(a);
create table lp1 partition of lp for values in(1);
create table lp2 partition of lp for values in(2);
create index on lp1(a);
insert into lp select 1 from generate_Series(1,1000000);
analyze lp;
set enable_sort=0;
explain (costs off) select * from lp order by a;

master gives:

Append
Disabled Nodes: 1
-> Index Only Scan using lp1_a_idx on lp1 lp_1
-> Sort
Sort Key: lp_2.a
-> Seq Scan on lp2 lp_2

which isn't correct. Append appears disabled, but it's not. Sort is.
Before I fixed that in the patch, I was incorrectly getting the
"Disabled: true" under the Append. I feel we're more likely to get bug
reports alerting us to incorrect logic when the disabled property only
appears on disabled nodes as there are far fewer of them to look at
and therefore it's more obvious when they're misplaced.

The patched version correctly gives us:

Append
-> Index Only Scan using lp1_a_idx on lp1 lp_1
-> Sort
Disabled: true
Sort Key: lp_2.a
-> Seq Scan on lp2 lp_2

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-10-03 22:17:01 Re: Should rolpassword be toastable?
Previous Message Tom Lane 2024-10-03 21:39:06 Re: Should rolpassword be toastable?