From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Richard Guo <guofenglinux(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: A problem about partitionwise join |
Date: | 2024-03-19 07:39:53 |
Message-ID: | CAExHW5tmaAj=SHZcURhsiAbQbztf_XMHgAu5_hz-cp-GebJ6qQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Mar 19, 2024 at 8:18 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> (Sorry it takes me some time to get back to this thread.)
>
> On Thu, Mar 7, 2024 at 7:13 PM Ashutosh Bapat <
> ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
>> I did a deeper review of the patch. Here are some comments
>>
>
> Thank you for the review!
>
>
>> Approach
>> --------
>> The equijoin condition between partition keys doesn't appear in the
>> join's restrictilist because of 'best_score' strategy as you explained well
>> in [2]. What if we add an extra score for clauses between partition keys
>> and give preference to equijoin between partition keys? Have you given it a
>> thought? I feel that having an equijoin clause involving partition keys has
>> more usages compared to a clause with any random column. E.g. nextloop may
>> be able to prune partitions from inner relation if the clause contains a
>> partition key.
>>
>
> Hmm, I think this approach won't work in cases where one certain pair of
> partition keys has formed an EC that contains pseudoconstants. In such
> cases, the EC machinery will generate restriction clauses like 'pk =
> const' rather than any join clauses.
>
That should be ok and more desirable. Clauses like pk = const will leave
only one partition around in each of the joining relations thus PWJ won't
be required OR it will be automatic - whichever way you see it.
>
> Besides, it seems to me that it's not a cheap operation to check whether
> a join clause is between partition keys when we generate join clauses
> from ECs in generate_join_implied_equalities().
>
Why? The code would be the same as what we have
in have_partkey_equi_join().
>
>
> Documentation
>> -------------
>> The patch does not modify any documentation. The only documentation I
>> could find about partitionwise join is the one for GUC
>> 'enable_partitionwise_join'. It says
>> --- quote
>> "Partitionwise join currently applies only when the join conditions
>> include all the partition keys, which must be of the same data type and
>> have one-to-one matching sets of child partitions.".
>> --- unquote
>> This sentence is general and IMO covers the case this patch considers.
>> But in general I feel that partitionwise join and aggregation deserve
>> separate sections next to "partition pruning" in [1]; It should mention
>> advanced partition matching algorithm as well. Would you be willing to
>> write one and then expand it for the case in the patch?
>>
>
> I don't think it should be part of this patch to add a new section in
> the docs to explain partitionwise join and aggregation. Maybe that
> deserves a separate patch?
>
Yes.
3. The testcase added looks artificial. it outputs data that has same value
>> for two columns which is equal to the primary key of the other table - when
>> would somebody do that?. Is there some real life example where this change
>> will be useful?
>>
>
> Hmm, I think the test case is good as long as it reveals the issue that
> this patch fixes. It follows the same format as the existing test case
> just above it. I'm not sure if there are real life examples, but I
> think it may not always be necessary to derive test cases from them.
>
Let's defer this to the committer.
>
>
>> Code
>> ----
>> Minor comment for now. It will be better to increment num_equal_pks
>> immediately after setting pk_known_equal[ipk] = true. Otherwise the code
>> gets confusing around line 2269. I will spend more time reviewing the code
>> next week.
>>
>
> Hmm, the increment of num_equal_pks on line 2272 is parallel to the one
> in the first loop (around line 2200). Maybe it's better to keep them
> consistent as the current patch does?
>
>
In the first loop, setting pk_known_equal[ipk1] = true and ++num_equal_pks
happens on consecutive lines. That's not true in the second loop, where
there are at least some code line where num_equal_pks is inconsistent with
the number of "true" entries in pk_known_equal. We should avoid that.
--
Best Wishes,
Ashutosh Bapat
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2024-03-19 08:28:37 | What is a typical precision of gettimeofday()? |
Previous Message | Bertrand Drouvot | 2024-03-19 07:34:09 | Re: Autogenerate some wait events code and documentation |