From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different. |
Date: | 2024-10-31 13:08:58 |
Message-ID: | CA+HiwqEHx++qTjCGmWyYZDiPR=_fLGPjR7RcFEVnSNZVc-Qj2g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 30, 2024 at 9:36 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> On Wed, Oct 30, 2024 at 11:58 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> >
> > I missed a case when column collation and partition key collation are
> > the same and indeterministic.
> > that should be fine for partition-wise join.
> > so v2 attached.
> >
> > have_partkey_equi_join, match_expr_to_partition_keys didn't do any
> > collation related check.
> > propose v2 change disallow partitionwise join for case when
> > column collation is indeterministic *and* is differ from partition
> > key's collation.
> >
> > the attached partition_wise_join_collation.sql is the test script.
> > you may use it to compare with the master behavior.
>
> What if the partkey collation and column collation are both deterministic,
> but with different sort order?
>
> I'm not familiar with this part of the code base, but it seems to me the
> partition wise join should use partkey collation instead of column collation,
> because it's the partkey collation that decides which partition a row to
> be dispatched.
>
> What Jian proposed is also reasonable but seems another aspect of $subject?
I think we should insist that the join key collation and the partition
collation are exactly the same and refuse to match them if they are
not.
+ {
+ Oid colloid = exprCollation((Node *) expr);
+
+ if ((partcoll != colloid) &&
+ OidIsValid(colloid) &&
+ !get_collation_isdeterministic(colloid))
+ *coll_incompatiable = true;
I am not quite sure what is the point of checking whether or not the
expression collation is deterministic after confirming that it's not
the same as partcoll.
Attached 0002 is what I came up with. One thing that's different from
what Jian proposed is that match_expr_to_partition_keys() returns -1
(expr not matched to any key) when the collation is also not matched
instead of using a separate output parameter for that.
0001 is the patch for the partitionwise grouping issue (bug #18568).
I concluded that even partial aggregate should be disallowed when the
grouping collation doesn't match partitioning collation. The result
with partial partitionwise grouping is not the same as when
enable_partitionwise_aggregate is off.
--
Thanks, Amit Langote
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Disallow-partitionwise-grouping-when-collation-do.patch | application/octet-stream | 11.7 KB |
v3-0002-Disallow-partitionwise-join-when-collation-doesn-.patch | application/octet-stream | 8.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Karthik Ramanathan | 2024-10-31 13:20:54 | "command cannot affect row a second time" in INSERT ... ON CONFLICT |
Previous Message | Jesper Pedersen | 2024-10-31 12:59:01 | Re: protocol-level wait-for-LSN |