From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
Subject: | Hash join not finding which collation to use for string hashing |
Date: | 2020-01-28 23:36:03 |
Message-ID: | 38121352-48B2-401F-84F3-0E5344D834D8@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
While reviewing the partition-wise join patch, I ran into an issue that exists in master, so rather than responding to that patch, I’m starting this new thread.
I noticed that this seems similar to the problem that was supposed to have been fixed in the "Re: COLLATE: Hash partition vs UPDATE” thread. As such, I’ve included Tom and Amit in the CC list.
Notice the "ERROR: could not determine which collation to use for string hashing”
The following is extracted from the output from the test:
> CREATE TABLE raw_data (a text);
> INSERT INTO raw_data (a) VALUES ('Türkiye'),
> ('TÜRKIYE'),
> ('bıt'),
> ('BIT'),
> ('äbç'),
> ('ÄBÇ'),
> ('aaá'),
> ('coté'),
> ('Götz'),
> ('ὀδυσσεύς'),
> ('ὈΔΥΣΣΕΎΣ'),
> ('を読み取り用'),
> ('にオープンできませんでした');
> -- Create unpartitioned tables for test
> CREATE TABLE alpha (a TEXT COLLATE "ja_JP", b TEXT COLLATE "sv_SE");
> CREATE TABLE beta (a TEXT COLLATE "tr_TR", b TEXT COLLATE "en_US");
> INSERT INTO alpha (SELECT a, a FROM raw_data);
> INSERT INTO beta (SELECT a, a FROM raw_data);
> ANALYZE alpha;
> ANALYZE beta;
> EXPLAIN (COSTS OFF)
> SELECT t1.a, t2.a FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a) WHERE t1.a IN ('äbç', 'ὀδυσσεύς');
> QUERY PLAN
> ------------------------------------------------------------
> Hash Join
> Hash Cond: ((t2.a)::text = (t1.a)::text)
> -> Seq Scan on beta t2
> -> Hash
> -> Seq Scan on alpha t1
> Filter: (a = ANY ('{äbç,ὀδυσσεύς}'::text[]))
> (6 rows)
>
> SELECT t1.a, t2.a FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a) WHERE t1.a IN ('äbç', 'ὀδυσσεύς');
> ERROR: could not determine which collation to use for string hashing
> HINT: Use the COLLATE clause to set the collation explicitly.
>
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0001-WIP-patch-to-demonstrate-problem.patch | application/octet-stream | 4.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Kohei KaiGai | 2020-01-29 00:59:57 | Re: Is custom MemoryContext prohibited? |
Previous Message | Tomas Vondra | 2020-01-28 23:24:56 | Re: Complete data erasure |