Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Date: 2024-11-01 10:08:00
Message-ID: CA+HiwqH_XkE2RyYtU_TKfASeV7+q3GLDzM3ssH7pc0eu5Fh-fw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 1, 2024 at 5:08 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> just a quick reply while testing v4-0001.
> tests copy from src/test/regress/sql/partition_aggregate.sql first 40 lines.
>
> drop table if exists pagg_tab;
> CREATE TABLE pagg_tab (a int, b int, c text, d int) PARTITION BY
> LIST(c collate "C");
> CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('0000',
> '0001', '0002', '0003', '0004');
> CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('0005',
> '0006', '0007', '0008');
> CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('0009',
> '0010', '0011');
> INSERT INTO pagg_tab SELECT (i % 20), (i % 30), to_char(i % 12,
> 'FM0000'), i % 30 FROM generate_series(0, 2999) i;
> ANALYZE pagg_tab;
>
> EXPLAIN (COSTS OFF, settings)
> SELECT a, sum(b), avg(b), count(*), max(b) FROM pagg_tab GROUP BY a;
>
>
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
> Finalize HashAggregate
> Group Key: pagg_tab.a
> -> Append
> -> Partial HashAggregate
> Group Key: pagg_tab.a
> -> Seq Scan on pagg_tab_p1 pagg_tab
> -> Partial HashAggregate
> Group Key: pagg_tab_1.a
> -> Seq Scan on pagg_tab_p2 pagg_tab_1
> -> Partial HashAggregate
> Group Key: pagg_tab_2.a
> -> Seq Scan on pagg_tab_p3 pagg_tab_2
> Settings: enable_partitionwise_aggregate = 'on',
> enable_partitionwise_join = 'on', max_parallel_workers_per_gather =
> '0', enable_increm
> ental_sort = 'off'
>
>
>
> drop table if exists pagg_tab;
> CREATE TABLE pagg_tab (a text, b int, c text, d int) PARTITION BY
> LIST(c collate "C");
> CREATE TABLE pagg_tab_p1 PARTITION OF pagg_tab FOR VALUES IN ('0000',
> '0001', '0002', '0003', '0004');
> CREATE TABLE pagg_tab_p2 PARTITION OF pagg_tab FOR VALUES IN ('0005',
> '0006', '0007', '0008');
> CREATE TABLE pagg_tab_p3 PARTITION OF pagg_tab FOR VALUES IN ('0009',
> '0010', '0011');
> INSERT INTO pagg_tab SELECT (i % 20)::text, (i % 30), to_char(i % 12,
> 'FM0000'), i % 30 FROM generate_series(0, 2999) i;
> ANALYZE pagg_tab;
> EXPLAIN (COSTS OFF, settings)
> SELECT a, sum(b), avg(b), count(*), max(b) FROM pagg_tab GROUP BY a;
>
>
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate
> Group Key: pagg_tab.a
> -> Append
> -> Seq Scan on pagg_tab_p1 pagg_tab_1
> -> Seq Scan on pagg_tab_p2 pagg_tab_2
> -> Seq Scan on pagg_tab_p3 pagg_tab_3
> Settings: enable_partitionwise_aggregate = 'on',
> enable_partitionwise_join = 'on', max_parallel_workers_per_gather =
> '0', enable_increm
> ental_sort = 'off'
>
>
> it seems "PARTITION BY LIST(c collate "C");" collation compare with
> "GROUP BY a;".
> set collation_incompatible returned true.
> make it cannot do PARTITIONWISE_AGGREGATE_PARTIAL.
>
> but here "group by a", "a" is text data type, we can still do
> PARTITIONWISE_AGGREGATE_PARTIAL
> ?

Good catch. Looks like I added a bug in group_by_has_partkey() --
collation_incompatible should be set only when a grouping expression
matches a partition key.

--
Thanks, Amit Langote

Attachment Content-Type Size
v5-0002-Disallow-partitionwise-join-when-collation-doesn-.patch application/octet-stream 12.2 KB
v5-0001-Disallow-partitionwise-grouping-when-collation-do.patch application/octet-stream 12.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Alexander Lakhin 2024-11-01 10:00:00 Re: Improving tracking/processing of buildfarm test failures