From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> |
Subject: | Re: apply_scanjoin_target_to_paths and partitionwise join |
Date: | 2024-04-15 06:59:47 |
Message-ID: | CAExHW5sWk2Zpzf_n6kpGFJY9tH+Rpt5CSWcktzOU_hVANm0oaA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Here's patch with
On Thu, Apr 11, 2024 at 12:24 PM Ashutosh Bapat <
ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
>
> On Thu, Apr 11, 2024 at 12:07 PM Ashutosh Bapat <
> ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
>> Hi All,
>> Per below code and comment in apply_scanjoin_target_to_paths(), the
>> function zaps all the paths of a partitioned relation.
>> /*
>> * If the rel is partitioned, we want to drop its existing paths and
>> * generate new ones. This function would still be correct if we kept the
>> * existing paths: we'd modify them to generate the correct target above
>> * the partitioning Append, and then they'd compete on cost with paths
>> * generating the target below the Append
>> ... snip ...
>> */
>> if (rel_is_partitioned)
>> rel->pathlist = NIL;
>>
>> Later the function adjusts the targets of paths in child relations and
>> constructs Append paths from them. That works for simple partitioned
>> relations but not for join between partitioned relations. When
>> enable_partitionwise_join is true, the joinrel representing a join between
>> partitioned relations may have join paths joining append paths and Append
>> paths containing child join paths. Once we zap the pathlist, the only paths
>> that can be computed again are the Append paths. If the optimal path,
>> before applying the new target, was a join of append paths it will be lost
>> forever. This will result in choosing a suboptimal Append path.
>>
>> We have one such query in our regression set.
>>
>> SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN
>> plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE
>> coalesce(t1.a, 0 ) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY
>> t1.c, t1.a, t2.a, t3.a;
>>
>> For this query, the cheapest Append of Joins path has cost 24.97..25.57
>> and the cheapest Join of Appends path has cost 21.29..21.81. The latter
>> should be chosen even though enable_partitionwise_join is ON. But this
>> function chooses the first.
>>
>> The solution is to zap the pathlists only for simple partitioned
>> relations like the attached patch.
>>
>> With this patch above query does not choose non-partitionwise join path
>> and partition_join test fails. That's expected. But we need to replace that
>> query with some query which uses partitionwise join while maintaining the
>> conditions of the test as explained in the comment above that query. I have
>> tried a few variations but without success. Suggestions welcome.
>>
>
Found a replacement for that query by using a 2-way join instead of 3-way
join. The query still executes the referenced code in
process_outer_partition() as mentioned in the comment. I did think about
removing the original query. But it is the only example in our regression
tests where partitionwise join is more costly than non-partitionwise join.
So I have left it as is in the test. I am fine if others think that we
should remove it.
Adding to the next commitfest but better to consider this for the next set
of minor releases.
--
Best Wishes,
Ashutosh Bapat
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-partitioned-join-case-in-apply_scanjoin-20240415.patch | text/x-patch | 8.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2024-04-15 07:43:08 | Re: Catalog domain not-null constraints |
Previous Message | Tom Lane | 2024-04-15 06:31:59 | Re: Why is parula failing? |