Re: Removing unneeded self joins

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2024-07-19 08:30:07
Message-ID: CAEZATCUeJ93gWCEzK1eRxyZakvv5PVG9LfF3PkSGMsNBC7h9kA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 17 Jul 2024 at 01:45, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> Jian He gave a try to ChangeVarNodes() [1]. That gives some
> improvement, but the vast majority of complexity is still here. I
> think the reason for complexity of SJE is that it's the first time we
> remove relation, which is actually *used* and therefore might has
> references in awful a lot of places. In previous cases we removed
> relations, which were actually unused.
>

I had a quick look at this, and I have a couple of comments on the
rewriter changes.

The new function replace_relid() looks to be the same as adjust_relid_set().

The changes to ChangeVarNodes() look a little messy. There's a lot of
code duplicated between ChangeVarNodesExtended() and ChangeVarNodes(),
which could be avoided by having one call the other. Also, it would be
better for ChangeVarNodesExtended() to have a "flags" parameter
instead of an extra boolean parameter, to make it more extensible in
the future. However,...

I question whether ChangeVarNodesExtended() and the changes to
ChangeVarNodes() are really the right way to go about this.
ChangeVarNodes() in particular gains a lot more logic to handle
RestrictInfo nodes that doesn't really feel like it belongs there --
e.g., building NullTest nodes is really specific to SJE, and doesn't
seem like it's something ChangeVarNodes() should be doing.

A better solution might be to add a new walker function to
analyzejoins.c that does just what SJE needs, which is different from
ChangeVarNodes() in a number of ways. For Var nodes, it might
ultimately be necessary to do more than just change the varno, to
solve the RETURNING/EPQ problems. For RestrictInfo nodes, there's a
lot of SJE-specific logic. The SJE code wants to ignore RangeTblRef
nodes, but it could delegate to ChangeVarNodes() for various other
node types to avoid code duplication. At the top level, the stuff that
ChangeVarNodes() does to fields of the Query struct would be different
for SJE, I think.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2024-07-19 08:36:34 Re: Conflict detection and logging in logical replication
Previous Message Nazir Bilal Yavuz 2024-07-19 08:07:26 Re: PG_TEST_EXTRA and meson