From: | Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, pryzby(at)telsasoft(dot)com, rjuju123(at)gmail(dot)com, daniel(at)manitou-mail(dot)org, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz, peter(dot)eisentraut(at)enterprisedb(dot)com |
Subject: | Re: ICU for global collation |
Date: | 2022-09-15 15:41:31 |
Message-ID: | 2322791a368e9ad066edd648790b5e91@postgrespro.ru |
Lists: | pgsql-hackers |
On 2022-09-15 09:52, Kyotaro Horiguchi wrote:
> If I executed initdb as follows, I would be told to specify
> --icu-locale option.
>
>> $ initdb --encoding sql-ascii --locale-provider icu hoge
>> ...
>> initdb: error: ICU locale must be specified
>
> However, when I reran the command, it complains about incompatible
> encoding this time. I think it's more user-friendly to check for the
> encoding compatibility before the check for missing --icu-locale
> option.
>
> regards.
I agree with you. Attached is a new version of the patch. The
locale/encoding checks and their error reports in initdb have been
reordered: the encoding is now set first, and only then is the ICU
locale checked.
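To illustrate the ordering only, here is a minimal sketch. It is not the code from the patch: every name in it (the sketch_enc enum, sketch_icu_supports, sketch_check_icu_options and the result codes) is hypothetical, and the assumption that SQL_ASCII is the one unsupported encoding is made purely for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for encoding IDs; not the real pg_enc values. */
typedef enum { SKETCH_SQL_ASCII, SKETCH_UTF8, SKETCH_LATIN9 } sketch_enc;

/* Hypothetical result codes for the checks below. */
typedef enum {
    SKETCH_OK,
    SKETCH_ERR_ENCODING,    /* encoding cannot be used with ICU */
    SKETCH_ERR_NO_LOCALE    /* --icu-locale was not given */
} sketch_result;

/* Assumption for this sketch: only SQL_ASCII is unusable with ICU. */
static bool
sketch_icu_supports(sketch_enc enc)
{
    return enc != SKETCH_SQL_ASCII;
}

/*
 * Check the encoding first, and only then the ICU locale, so that the
 * user hears about an incompatible encoding before being asked to
 * supply an ICU locale.
 */
static sketch_result
sketch_check_icu_options(sketch_enc enc, const char *icu_locale)
{
    if (!sketch_icu_supports(enc))
        return SKETCH_ERR_ENCODING;
    if (icu_locale == NULL)
        return SKETCH_ERR_NO_LOCALE;
    return SKETCH_OK;
}
```

With this ordering, a call such as sketch_check_icu_options(SKETCH_SQL_ASCII, NULL) reports the encoding problem rather than the missing locale, which is the behaviour requested above.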
P.S. While working on the patch, I discovered that initdb always uses
the UTF8 encoding for the ICU provider unless the user specifies an
encoding explicitly:

    if (!encoding && locale_provider == COLLPROVIDER_ICU)
        encodingid = PG_UTF8;

IMO this creates additional errors for locales with other encodings:
$ initdb --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE
...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (UTF8) and the encoding that
the selected locale uses (LATIN9) do not match. This would lead to
misbehavior in various character string processing functions.
initdb: hint: Rerun initdb and either do not specify an encoding
explicitly, or choose a matching combination.
And ICU supports many other encodings as well; see the contents of
pg_enc2icu_tbl in encnames.c.
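For readers who have not looked at that table, the idea can be sketched as an array indexed by encoding ID that holds the ICU converter name, with NULL meaning "not usable with ICU". The sketch below is illustrative only: the demo_enc enum, demo_enc2icu_tbl and demo_is_encoding_supported_by_icu are hypothetical names, and the three entries are a small example rather than a copy of the real table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical encoding IDs for the sketch; not the real pg_enc enum. */
typedef enum {
    DEMO_SQL_ASCII,
    DEMO_UTF8,
    DEMO_LATIN9,
    DEMO_ENC_COUNT
} demo_enc;

/*
 * Illustrative mapping from encoding ID to ICU converter name.
 * A NULL entry means the encoding has no ICU equivalent.
 */
static const char *const demo_enc2icu_tbl[DEMO_ENC_COUNT] = {
    [DEMO_SQL_ASCII] = NULL,
    [DEMO_UTF8] = "UTF-8",
    [DEMO_LATIN9] = "ISO-8859-15",
};

/* Return true if the encoding ID is in range and has an ICU name. */
static bool
demo_is_encoding_supported_by_icu(int enc)
{
    if (enc < 0 || enc >= DEMO_ENC_COUNT)
        return false;
    return demo_enc2icu_tbl[enc] != NULL;
}
```

Under this model, an initdb run with a LATIN9 locale would not need to fall back to UTF8 just because the ICU provider was selected.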
--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
v2-diff_check_icu_encoding.patch | text/x-diff | 6.4 KB |