Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Date: 2024-11-01 07:25:28
Message-ID: CA+HiwqEQHVbAdFWNcrnDEZXk=m-Mk5ZtCoFq_tKkyoHPRATuag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, Nov 1, 2024 at 2:39 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> On Thu, Oct 31, 2024 at 9:09 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> >
> >
> > I think we should insist that the join key collation and the partition
> > collation are exactly the same and refuse to match them if they are
> > not.
> >
> > + {
> > + Oid colloid = exprCollation((Node *) expr);
> > +
> > + if ((partcoll != colloid) &&
> > + OidIsValid(colloid) &&
> > + !get_collation_isdeterministic(colloid))
> > + *coll_incompatiable = true;
> >
> > I am not quite sure what is the point of checking whether or not the
> > expression collation is deterministic after confirming that it's not
> > the same as partcoll.
> >
> > Attached 0002 is what I came up with. One thing that's different from
> > what Jian proposed is that match_expr_to_partition_keys() returns -1
> > (expr not matched to any key) when the collation is also not matched
> > instead of using a separate output parameter for that.
> >
> i was thinking that
> CREATE TABLE part_tab (c text collate "POSIX") PARTITION BY LIST(c collate "C");
> maybe can do partitionwise join.
> join key collation and the partition key collation same sure would
> make things easy.

I think that's maybe ok to do as a new feature (use partitionwise join
even if collations differ but are both deterministic?), but we should
take a more restrictive approach in a bug fix that is to be
back-patched.

> about 0002.
> Similar to PartCollMatchesExprColl in match_clause_to_partition_key
> I think we can simply do the following:
> no need to hack match_expr_to_partition_keys.
>
> @@ -2181,6 +2181,9 @@ have_partkey_equi_join(PlannerInfo *root,
> RelOptInfo *joinrel,
> if (ipk1 != ipk2)
> continue;
>
> + if (rel1->part_scheme->partcollation[ipk1] !=
> opexpr->inputcollid)
> + return false;

Yes, looks like that should be enough, thanks.

I've updated the patch. I've added another test case to test the new
collation matching code in the code block of have_partkey_equi_join()
that pairs partition keys via equivalence class.

Adding Ashutosh to cc, as the original author of this code, to get his
thoughts on these fixes.

--
Thanks, Amit Langote

Attachment Content-Type Size
v4-0002-Disallow-partitionwise-join-when-collation-doesn-.patch application/octet-stream 12.2 KB
v4-0001-Disallow-partitionwise-grouping-when-collation-do.patch application/octet-stream 12.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2024-11-01 07:26:45 Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Previous Message David Rowley 2024-11-01 07:21:50 Re: define pg_structiszero(addr, s, r)