From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Removing unneeded self joins |
Date: | 2024-10-28 21:18:27 |
Message-ID: | CAPpHfds=AapoZR23bU4YzT2Of_hOCr8ecPdOX6MOCK=TrAKMkg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 8, 2024 at 6:54 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> On 7/24/24 18:07, Dean Rasheed wrote:
> > So you have to remember that "inner" is "outer" and "outer" is "inner"
> > when going into generate_join_implied_equalities() from
> > remove_self_joins_one_group(). And the same thing happens when calling
> > innerrel_is_unique_ext() and match_unique_clauses(). I think all that
> > could be resolved by swapping "inner" and "outer" in the variable
> > names and comments in remove_self_joins_one_group().
> As you can see, when you dive into the implementation of the
> generate_join_implied_equalities, an inner join works symmetrically. We
> don't need any other clauses here, except mentioning both inner and
> outer to prove self-join. So, it doesn't matter which side to choose as
> an inner - all the effect can be: instead of x1=x2, we will have clause
> x2=x1.
I think switching places of inner and outer still makes sense in terms
of readability. I've done so for v10. Also, added a comment
regarding symmetry of arguments for generate_join_implied_equalities()
call.
> > This presumes that any caller interested in knowing the extra
> > baserestrictinfo clauses used to prove uniqueness must be looking at a
> > self join. That may be true today, but it doesn't seem like a good API
> > design choice. I think it would be better to just add "self_join" as
> > an extra parameter, and also maybe have the function return the
> > UniqueRelInfo containing the "extra_clauses", or NULL if it's not
> > unique. That way, it would be more extensible, if we wanted it to
> > return more information in the future.
> If more users are interested in this function, we can just rename the
> flag self_join to something more universal - extra_prove, I'm not sure.
> >
> > Instead of adding relation_has_unique_index_ext(), maybe it would be
> > OK to just change the signature of relation_has_unique_index_for(). It
> > looks like it's only called from one other place outside
> > analyzejoins.c. Perhaps the same is true for innerrel_is_unique_ext().
> I vote against it remembering of extensions and broken APIs.
I spend some time using github search to find usage of
innerrel_is_unique() and relation_has_unique_index() in extensions.
Didn't find extensions actually use them. Some forks are using them
in the way PostgreSQL doesn't. But for forks it's way easier to adopt
such trivial API changes. If even there are some extensions using
this functions, it wouldn't be a terrible case of the broken API. The
current way seems to be rather bloating of our API. This is why I
decided to remove innerrel_is_unique_ext() and
relation_has_unique_index_ext(), and just add arguments to
innerrel_is_unique() and relation_has_unique_index_for().
I also did some rephrasing and grammar fixes for comments.
------
Regards,
Alexander Korotkov
Supabase
Attachment | Content-Type | Size |
---|---|---|
v10-0001-Remove-useless-self-joins.patch | application/octet-stream | 123.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-10-28 21:24:02 | Re: sunsetting md5 password support |
Previous Message | Michael Banck | 2024-10-28 21:10:58 | Re: trying again to get incremental backup |