Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Date: 2024-11-05 07:55:25
Message-ID: CA+HiwqHv+7gp+SPfE-=+KYmvrcVWryz+F5HNG2Rb6=UZL_A+OQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Nov 4, 2024 at 12:28 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> hi.
> about v5.
> if (exprs_known_equal(root, expr1, expr2, btree_opfamily))
> {
> /*
> * Ensure that the collation of the expression matches
> * that of the partition key. Checking just one collation
> * (partcoll1 and exprcoll1) suffices because partcoll1
> * and partcoll2, as well as exprcoll1 and exprcoll2,
> * should be identical. This holds because both rel1 and
> * rel2 use the same PartitionScheme and expr1 and expr2
> * are equal.
> */
> if (partcoll1 == exprcoll1)
> {
> Oid partcoll2 PG_USED_FOR_ASSERTS_ONLY =
> rel1->part_scheme->partcollation[ipk];
> Oid exprcoll2 PG_USED_FOR_ASSERTS_ONLY =
> exprCollation(expr2);
> Assert(partcoll2 == exprcoll2);
> pk_known_equal[ipk] = true;
> if (OidIsValid(exprcoll1))
> elog(INFO, "this path called %s:%d",
> __FILE_NAME__, __LINE__);
> break;
> }
> }
>
> tests still passed, which means that we didn't have text data type as
> partition key related tests for partition-wise join.
> Do we need to add one?
>
> +-- Another case where the partition keys are matched via equivalence class,
> +-- not a join restriction clause.
> +
> +-- OK when the join clause uses the same collation as the partition key
> +EXPLAIN (COSTS OFF)
> +SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab4 t2 ON t1.c
> = t2.c AND t1.c = t2.b COLLATE "C" GROUP BY 1 ORDER BY 1;
>
> i suppose, you comments is saying that in have_partkey_equi_join
> the above query will return true via
> `if (exprs_known_equal(root, expr1, expr2, btree_opfamily))`
> But " t1.c = t2.b COLLATE "C" already in "restrictlist".
> In have_partkey_equi_join loop through "restrictlist" would return
> true for above query, won't reach exprs_known_equal.
>
> Other than the comments that confused me, the test and the results
> look fine to me.

Thanks, yes, a test case that exercises the partcoll1 == exprcoll1
code was missing.

> some column collation is case_insensitive, ORDER BY that column would
> render the output not deterministic.
> like 'A' before 'a' and 'a' before 'A' are both correct.
> it may cause regress tests to fail.
> So I did some minor refactoring to make the "ORDER BY" deterministic.

Thanks, merged.

--
Thanks, Amit Langote

Attachment Content-Type Size
v6-0002-Disallow-partitionwise-join-when-collation-doesn-.patch application/octet-stream 16.7 KB
v6-0001-Disallow-partitionwise-grouping-when-collation-do.patch application/octet-stream 12.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Shlok Kyal 2024-11-05 07:22:45 Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY