From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | m(dot)polyakova(at)postgrespro(dot)ru |
Cc: | pgsql-hackers(at)postgresql(dot)org, pryzby(at)telsasoft(dot)com, rjuju123(at)gmail(dot)com, daniel(at)manitou-mail(dot)org, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz, peter(dot)eisentraut(at)enterprisedb(dot)com |
Subject: | Re: ICU for global collation |
Date: | 2022-09-16 04:55:19 |
Message-ID: | 20220916.135519.1552320805811493586.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Thu, 15 Sep 2022 18:41:31 +0300, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru> wrote in
> P.S. While working on the patch, I discovered that UTF8 encoding is
> always used for the ICU provider in initdb unless it is explicitly
> specified by the user:
>
> if (!encoding && locale_provider == COLLPROVIDER_ICU)
> encodingid = PG_UTF8;
>
> IMO this creates additional errors for locales with other encodings:
>
> $ initdb --locale de_DE.iso885915@euro --locale-provider icu
> --icu-locale de-DE
> ...
> initdb: error: encoding mismatch
> initdb: detail: The encoding you selected (UTF8) and the encoding that
> the selected locale uses (LATIN9) do not match. This would lead to
> misbehavior in various character string processing functions.
> initdb: hint: Rerun initdb and either do not specify an encoding
> explicitly, or choose a matching combination.
>
> And ICU supports many encodings, see the contents of pg_enc2icu_tbl in
> encnames.c...
It seems to me that UTF8 is the best default, since it fits almost all
cases that use ICU locales.
So, we need to specify the encoding explicitly in that case.
$ initdb --encoding iso-8859-15 --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE
However, I think this is hardly understandable from the documentation.
(I checked this using euc-jp [1] so it might be wrong..)
[1] initdb --encoding euc-jp --locale ja_JP.eucjp --locale-provider icu --icu-locale ja-x-icu
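To illustrate the behavior discussed above, here is a rough sketch of the encoding choice and the subsequent mismatch check. It is a simplified model, not the actual initdb code: the type provider_t and the functions choose_encoding() and encoding_matches_locale() are hypothetical names made up for this example, and encodings are compared as plain strings, whereas the real check presumably works on encoding ids and has more cases.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the locale provider setting. */
typedef enum
{
	PROVIDER_LIBC,
	PROVIDER_ICU
} provider_t;

/*
 * Simplified model of the encoding choice:
 *   1. an explicitly given --encoding always wins;
 *   2. otherwise, with the ICU provider, UTF8 is used;
 *   3. otherwise, the encoding implied by the libc locale is used.
 */
static const char *
choose_encoding(const char *explicit_encoding,
				provider_t provider,
				const char *locale_encoding)
{
	if (explicit_encoding != NULL)
		return explicit_encoding;
	if (provider == PROVIDER_ICU)
		return "UTF8";
	return locale_encoding;
}

/*
 * Simplified model of the check behind the "encoding mismatch" error:
 * the chosen encoding must agree with the one the libc locale uses.
 */
static bool
encoding_matches_locale(const char *chosen, const char *locale_encoding)
{
	return strcmp(chosen, locale_encoding) == 0;
}
```

In this model, choose_encoding(NULL, PROVIDER_ICU, "LATIN9") yields "UTF8", which does not match the locale's LATIN9 and corresponds to the error quoted above. Passing "LATIN9" explicitly makes the check pass, which is the effect of adding --encoding to the command line.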
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center