From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Removing unneeded self joins |
Date: | 2024-07-17 00:45:07 |
Message-ID: | CAPpHfduRB9oW3YRDpvPGcDSAbH4iu0LSRhxpRRAzBh7euC9amQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi, Tom!
I'd like to give you and update on the progress with SJE.
On Mon, May 6, 2024 at 6:54 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I want to go on record right now as disagreeing with the plan proposed
> > in the commit message for the revert commit, namely, committing this
> > again early in the v18 cycle. I don't think Tom would have proposed
> > reverting this feature unless he believed that it had more serious
> > problems than could be easily fixed in a short period of time. I think
> > that concern is well-founded, given the number of fixes that were
> > committed. It seems likely that the patch needs significant rework and
> > stabilization before it gets committed again, and I think it shouldn't
> > be committed again without explicit agreement from Tom or one of the
> > other committers who have significant experience with the query
> > planner.
>
> FWIW I accept some of the blame here, for not having paid any
> attention to the SJE work earlier. I had other things on my mind
> for most of last year, and not enough bandwidth to help.
>
> The main thing I'd like to understand before we try this again is
> why SJE needed so much new query-tree-manipulation infrastructure.
> I would have expected it to be very similar to the left-join
> elimination we do already, and therefore to mostly just share the
> existing infrastructure. (I also harbor suspicions that some of
> the new code existed just because someone didn't research what
> was already there --- for instance, the now-removed replace_varno
> sure looks like ChangeVarNodes should have been used instead.)
Jian He gave a try to ChangeVarNodes() [1]. That gives some
improvement, but the vast majority of complexity is still here. I
think the reason for complexity of SJE is that it's the first time we
remove relation, which is actually *used* and therefore might has
references in awful a lot of places. In previous cases we removed
relations, which were actually unused.
There are actually alternative designs for this feature. I've
proposed "alias relids" before [2]. But it's not clear we will
resolve more problems than create, given that it could break awfully a
lot of assumptions during query planning. Andrei also proposed that
perl script could generate us a walker over planner structures [3].
Although this method might offer a structured approach, it seems like
overengineering for the problem at hand.
I believe it's worth giving the current approach another chance.
Vardan Pogosyan has conducted some tests, and I am in the process of
clarifying the details. We could enhance the approach by adding more
comments to ensure that any changes in the planner data structure are
flagged for potential revisions in the SJE code. What do you think?
> Another thing that made me pretty sad was 8c441c082 (Forbid SJE with
> result relation). While I don't claim that that destroyed the entire
> use case for SJE, it certainly knocked its usefulness down by many
> notches, maybe even to the point where it's not worth putting in the
> effort needed to get it to re-committability. So I think we need to
> look harder at finding a way around that. Is the concern that
> RETURNING should return either old or new values depending on which
> RTE is mentioned? If so, maybe the feature Dean has proposed to
> allow RETURNING to access old values [1] is a prerequisite to moving
> forward. Alternatively, perhaps it'd be good enough to forbid SJE
> only when the non-target relation is actually mentioned in RETURNING.
As Andrei pointed it's possible to apply SJE to result relation [4],
but where it's not a target relation. I guess the target relation
case is what you're most interested. In this case we hit problem of
joining relation having different row marks. In turn that triggers
EPQ problem [5] and probably more. In order to resolve that we need a
way to store multiple (at least two, but sure if more is needed)
tuples for relation. I still feel that we should postpone that,
because even basic SJE without target relation support is challenging.
There is probably a way to implement target relation support for PG18
after committing basic SJE. But that would require a lot of your
design work and guidance. I don't dare to design this kind of things.
Links.
1. https://www.postgresql.org/message-id/CACJufxHBLhOD1LerM643dgh%3DUZFGhPWfP1027D2x1W6DhF_BaQ%40mail.gmail.com
2. https://www.postgresql.org/message-id/CAPpHfdv6B8HCLdj8WidBryRrX0%2BX3F1rrR8uAuMQmp6rvPdscg%40mail.gmail.com
3. https://www.postgresql.org/message-id/96250a42-20e3-40f0-9d45-f53ae852f8ed%40gmail.com
4. https://www.postgresql.org/message-id/5b49501c-9cb3-4c5d-9d56-49704ff08143%40gmail.com
5. https://www.postgresql.org/message-id/flat/CAPpHfduM6X82ExT0r9UzFLJ12wOYPvRw5vT2Htq0gAPBgHhKeQ%40mail.gmail.com
------
Regards,
Alexander Korotkov
Supabase
From | Date | Subject | |
---|---|---|---|
Next Message | Paul George | 2024-07-17 00:50:15 | Re: Wrong results with grouping sets |
Previous Message | Thomas Munro | 2024-07-17 00:05:23 | Re: [EXTERNAL] Re: Add non-blocking version of PQcancel |