Re: Removing unneeded self joins

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2024-07-12 10:30:35
Message-ID: CAPpHfdvTHuhB5+Eohb-z7zWM6u9Y=W6vb8MMOqhfKeRkhVZGxg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Andrei!

On Fri, Jul 12, 2024 at 6:05 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
>
> On 7/11/24 14:43, jian he wrote:
> > On Tue, Jul 9, 2024 at 2:06 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> >>
> >> On 7/2/24 07:25, jian he wrote:
> >>> to make sure it's correct, I have added a lot of tests,
> >>> Some of this may be contrived, maybe some of the tests are redundant.
> >> Thanks for your job!
> >> I passed through the patches and have some notes:
> >> 1. Patch 0001 has not been applied anymore since the previous week's
> >> changes in the core. Also, there is one place with trailing whitespace.
> >
> > thanks.
> > because the previous thread mentioned the EPQ problem.
> > in remove_useless_self_joins, i make it can only process CMD_SELECT query.
> I would like to oppose here: IMO, it is just a mishap which we made
> because of a long history of patch transformations. There we lost the
> case where RowMark exists for only one of candidate relations.
> Also, after review I think we don't need so many new tests. Specifically
> for DML we already have one:
>
> EXPLAIN (COSTS OFF)
> UPDATE sj sq SET b = 1 FROM sj as sz WHERE sq.a = sz.a;
>
> And we should just add something to elaborate it a bit.
> See the patch in attachment containing my proposal to improve v4-0001
> main SJE patch. I think it resolved the issue with EPQ assertion as well
> as problems with returning value.

I tried this. I applied 0001 from [1] and 0002 from [2]. Then I
tried the concurrent test case [3]. It still fails with assert for
me. But assert and related stuff is the least problem. The big
problem, as described in [3], is semantical change in query. When EPQ
is applied, we fetch the latest tuple of the target relation
regardless snapshot. But for the self-joined relation we should still
use the snapshot-satisfying tuple. I don't see even attempt to
address this in your patch. And as I pointed before, this appears
quite complex.

Links.
1. https://www.postgresql.org/message-id/96250a42-20e3-40f0-9d45-f53ae852f8ed%40gmail.com
2. https://www.postgresql.org/message-id/5b49501c-9cb3-4c5d-9d56-49704ff08143%40gmail.com
3. https://www.postgresql.org/message-id/CAPpHfduM6X82ExT0r9UzFLJ12wOYPvRw5vT2Htq0gAPBgHhKeQ%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Pyhalov 2024-07-12 11:39:13 Partition-wise join with whole row vars
Previous Message Bertrand Drouvot 2024-07-12 10:26:32 Re: Flush pgstats file during checkpoints