Re: Removing unneeded self joins

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Subject: Re: Removing unneeded self joins
Date: 2025-02-12 23:00:41
Message-ID: CAPpHfdsFcde5VWCxSup7x_12ig6uynrFaOms62-M_c8DM-FTBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Tue, Feb 11, 2025 at 5:31 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> Hi! Thank you for the work with this subject, I think it is really
> important.
>
> On 10.02.2025 22:58, Alexander Korotkov wrote:
> > Hi!
> >
> > On Mon, Feb 10, 2025 at 7:19 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> >> On 9/2/2025 18:41, Alexander Korotkov wrote:
> >>> Regarding adjust_relid_set() and replace_relid(). I think they are
> >>> now strictly equivalent, except for the case then old relid is given
> >>> and not found. In this case adjust_relid_set() returns the original
> >>> relids while replace_relid() returns a copy. The behavior of
> >>> adjust_relid_set() appears more desirable as we don't need extra
> >>> copying when no modification is done. So, I've replaced all
> >>> replace_relid() with adjust_relid_set().
> >> Ok, I glanced into it, and it makes sense to merge these routines.
> >> I think the comment to adjust_relid_set() should be arranged, too. See
> >> the attachment for a short variant of such modification.
> >>> Also, I did some grammar correction to your new comment in tests.
> >> Thanks!
> > I've further revised adjust_relid_set() header comment.
> >
> > Looking back to the work done since previous attempt to commit this to
> > pg17, I can highlight following.
> > 1) We're now using more of existing infrastructure including
> > adjust_relid_set() and ChangeVarNodes(). The most of complexity is
> > still there though.
> > 2) We've checked few ways to further simplify this patch. But yet the
> > current way still feels to be best possible.
> > 3) For sure, several bugs were fixed.
> >
> > I think we could give it another chance for pg18 after some further
> > polishing (at least commit message still needs to be revised). Any
> > thoughts on this? Tom?
> >
> I didn't find any mistakes, I just have a refactoring remark. I think
> the part where we add non-redundant expressions with the
> binfo_candidates, jinfo_candidates
> check can be moved to a separate function, otherwise the code is very
> repetitive in this place. I did it and attached diff file

Thank you. I've integrated that into a patch. However, I've to
change keep_rinfo_list to be passed by pointer to
add_non_redundant_clauses(), because it might be changed in
distribute_restrictinfo_to_rels(). Without that there is a case of
duplicated clause in regression tests.

I've changed 'inner' and 'outer' vise versa in
remove_self_joins_one_group() for better readability (I believe that
was discussed upthread but lost). Also, I did a round of improvement
for comments and commit message.

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v13-0001-Implement-Self-Join-Elimination.patch application/octet-stream 131.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2025-02-12 23:10:53 Re: describe special values in GUC descriptions more consistently
Previous Message Devulapalli, Raghuveer 2025-02-12 22:46:59 RE: Improve CRC32C performance on SSE4.2