Re: A problem about partitionwise join

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Richard Guo <riguo(at)pivotal(dot)io>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A problem about partitionwise join
Date: 2020-04-08 04:58:49
Message-ID: CAMbWs48EPxJeErxDcV-gSTYeSwsd1ri37Q_ZNxt5NSWEpP1NiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 5, 2020 at 4:38 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Richard Guo <riguo(at)pivotal(dot)io> writes:
> > Rebased the patch with latest master and also addressed the test case
> > failure reported by PostgreSQL Patch Tester.
>
> I looked this patch over, but I don't like it too much: it seems very
> brute-force (and badly under-commented). Creating all those extra
> RestrictInfos isn't too cheap in itself, plus they'll jam up the
> equivalence-class machinery for future tests.
>

Thanks for the review.

>
> There is already something in equivclass.c that would almost do what
> we want here: exprs_known_equal() would tell us whether the partkeys
> can be found in the same eclass, without having to generate data
> structures along the way. The current implementation is not watertight
> because it doesn't check opclass semantics, but that consideration
> can be bolted on readily enough. So that leads me to something like
> the attached.
>

I looked through this patch and it's much more elegant than the previous
one. Thank you for working on it.

For partkeys which fail to be identified as equal by looking through
restrictlist, it's a good idea to check them in ECs with the help of
exprs_known_equal().

I have some concern about we only check non-nullable partexprs. Is it
possible that two nullable partexprs come from the same EC? I tried to
give an example but failed.

>
> One argument that could be made against this approach is that if there
> are a lot of partkey expressions, this requires O(N^2) calls to
> exprs_known_equal, something that's already not too cheap. I think
> that that's not a big problem because the number of partkey expressions
> would only be equal to the join degree (ie it doesn't scale with the
> number of partitions of the baserels) ... but maybe I'm wrong about
> that?

You are right. According to how partexpr is formed for joinrel in
set_joinrel_partition_key_exprs(), each base relation within the join
contributes one partexpr, so the number of partexprs would be equal to
the join degree.

> I also wonder if it's really necessary to check every pair
> of partkey expressions. It seems at least plausible that in the
> cases we care about, all the partkeys on each side would be in the same
> eclasses anyway, so that comparing the first members of each list would
> be sufficient. But I haven't beat on that point.
>

Not sure about it. But cannot come out with a counterexample.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2020-04-08 05:09:18 Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Previous Message Andrew Gierth 2020-04-08 04:29:24 Re: FETCH FIRST clause WITH TIES option