Re: Removing unneeded self joins

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2024-05-03 03:19:58
Message-ID: 75de38d9-005d-4995-9ecf-a155844dcbcd@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/3/24 06:19, Tom Lane wrote:
> Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
>> I would appreciate your review of this patchset, and review from Andrei as well.
>
> I hate to say this ... but if we're still finding bugs this
> basic in SJE, isn't it time to give up on it for v17?
>
> I might feel better about it if there were any reason to think
> these were the last major bugs. But you have already committed
> around twenty separate fixes for the original SJE patch, and
> now here you come with several more; so it doesn't seem like
> the defect rate has slowed materially. There can be no doubt
> whatever that the original patch was far from commit-ready.
>
> I think we should revert SJE for v17 and do a thorough design
> review before trying again in v18.
I need to say I don't see any evidence of bad design.
I think this feature follows the example of 2489d76 [1], 1349d27 [2],
and partitionwise join features — we get some issues from time to time,
but these strengths and frequencies are significantly reduced.
First and foremost, this feature is highly isolated: like the PWJ
feature, you can just disable (not enable?) SJE and it guarantees you
will avoid the problems.
Secondly, this feature reflects the design decisions the optimiser has
made before. It raises some questions: Do we really control the
consistency of our paths and the plan tree? Maybe we hide our
misunderstanding of its logic by extensively copying expression trees,
sometimes without fundamental necessity. Perhaps the optimiser needs
some abstraction layers or reconstruction to reduce the quickly growing
complexity.
A good example here is [1]. IMO, the new promising feature it has
introduced isn't worth the complexity it added to the planner.
SJE, much like OR <-> ANY transformation, introduces a fresh perspective
into the planner: if we encounter a complex, redundant query, it may be
more beneficial to invest in simplifying the internal query
representation rather than adding new optimisations that will grapple
with this complexity.
Also, SJE raised questions I've never seen before, like: Could we
control the consistency of the PlannerInfo by changing something in the
logic?
Considering the current state, I don't see any concrete outcomes or
evidence that a redesign of the feature will lead us to a new path.
However, I believe the presence of SJE in the core could potentially
trigger improvements in the planner. As a result, I vote to stay with
the feature. But remember, as an author, I'm not entirely objective, so
let's wait for alternative opinions.

[1] Make Vars be outer-join-aware
[2] Improve performance of ORDER BY / DISTINCT aggregates

--
regards,
Andrei Lepikhov
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-05-03 04:00:00 Re: cataloguing NOT NULL constraints
Previous Message Nathan Bossart 2024-05-03 02:51:40 improve performance of pg_dump with many sequences