Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Date: 2024-11-01 02:39:35
Message-ID: CAHewXNk0U+FJAXT1zm6LtZrojYCmnhc_0=3DjvJ3LvZTsjsKAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Langote <amitlangote09(at)gmail(dot)com> 于2024年10月31日周四 21:09写道:

> On Wed, Oct 30, 2024 at 9:36 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > On Wed, Oct 30, 2024 at 11:58 AM jian he <jian(dot)universality(at)gmail(dot)com>
> wrote:
> > >
> > > I missed a case when column collation and partition key collation are
> > > the same and indeterministic.
> > > that should be fine for partition-wise join.
> > > so v2 attached.
> > >
> > > have_partkey_equi_join, match_expr_to_partition_keys didn't do any
> > > collation related check.
> > > propose v2 change disallow partitionwise join for case when
> > > column collation is indeterministic *and* is differ from partition
> > > key's collation.
> > >
> > > the attached partition_wise_join_collation.sql is the test script.
> > > you may use it to compare with the master behavior.
> >
> > What if the partkey collation and column collation are both
> deterministic,
> > but with different sort order?
> >
> > I'm not familiar with this part of the code base, but it seems to me the
> > partition wise join should use partkey collation instead of column
> collation,
> > because it's the partkey collation that decides which partition a row to
> > be dispatched.
> >
> > What Jian proposed is also reasonable but seems another aspect of
> $subject?
>
> I think we should insist that the join key collation and the partition
> collation are exactly the same and refuse to match them if they are
> not.
>
> + {
> + Oid colloid = exprCollation((Node *) expr);
> +
> + if ((partcoll != colloid) &&
> + OidIsValid(colloid) &&
> + !get_collation_isdeterministic(colloid))
> + *coll_incompatiable = true;
>
> I am not quite sure what is the point of checking whether or not the
> expression collation is deterministic after confirming that it's not
> the same as partcoll.
>

Me, too.

>
> Attached 0002 is what I came up with. One thing that's different from
> what Jian proposed is that match_expr_to_partition_keys() returns -1
>

Agree.

(expr not matched to any key) when the collation is also not matched
> instead of using a separate output parameter for that.
>

In have_partkey_equi_join()
...
if (exprs_known_equal(root, expr1, expr2, btree_opfamily))
{
Oid partcoll2 = rel1->part_scheme->partcollation[ipk];
....

I think we should use rel2 here, not rel1.

--
Thanks,
Tender Wang

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tender Wang 2024-11-01 02:59:21 Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Previous Message Richard Guo 2024-11-01 01:50:24 Re: Eager aggregation, take 3