Re: Removing unneeded self joins

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2024-07-09 06:06:10
Message-ID: 96250a42-20e3-40f0-9d45-f53ae852f8ed@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/2/24 07:25, jian he wrote:
> to make sure it's correct, I have added a lot of tests,
> Some of this may be contrived, maybe some of the tests are redundant.
Thanks for your job!
I passed through the patches and have some notes:
1. Patch 0001 has not been applied anymore since the previous week's
changes in the core. Also, there is one place with trailing whitespace.

Looking into the 0002 and 0003 patches, I think they 1) should be merged
and 2) It makes sense to use the already existing pull_varnos_of_level
routine instead of a new walker. See the patches in the attachment as a
sketch.
Also, I'm not sure about the tests. It looks like we have a lot of new
tests.

However, the main issue mentioned above is the correctness of relid
replacement in planner structures.
We have the machinery to check and replace relids in a Query. But
PlannerInfo is a bin for different stuff that the optimisation needs to
convert the parse tree to the corresponding cloud of paths.
A good demo of the problem is the introduction of the JoinDomain structure:
It contains a relids field and has no tests for that. We haven't known
for a long time about the issue of SJE not replacing the relid in this
structure.
The approach with 'Relation Alias' mentioned by Alexander raises many
hard questions about accessing simple_rel_array directly or, much worse,
about checking the scope of some clause where we didn't touch
RelOptInfo, just compare two relids fields.
The safest decision would be to restart query planning over parse tree
with removed self-joins, but now planner code isn't ready for that yet.
But maybe we should put this problem on the shoulders of a developer and
made something like with nodes: perl script which will generate walker
switch over PlannerInfo structure?

--
regards, Andrei Lepikhov

Attachment Content-Type Size
v4-0001-Remove-useless-self-joins.patch text/x-patch 121.0 KB
v4-0002-Apply-SJE-to-DML-queries-with-RETURNING-clause.patch text/x-patch 42.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-07-09 06:07:24 Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)
Previous Message Peter Smith 2024-07-09 06:01:33 Re: walsender.c comment with no context is hard to understand