Re: ICU for global collation

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: m(dot)polyakova(at)postgrespro(dot)ru
Cc: pgsql-hackers(at)postgresql(dot)org, pryzby(at)telsasoft(dot)com, rjuju123(at)gmail(dot)com, daniel(at)manitou-mail(dot)org, AndrewBille(at)gmail(dot)com, michael(at)paquier(dot)xyz, peter(dot)eisentraut(at)enterprisedb(dot)com
Subject: Re: ICU for global collation
Date: 2022-09-16 08:11:16
Message-ID: 20220916.171116.507864395269358149.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 16 Sep 2022 09:49:28 +0300, Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru> wrote in
> On 2022-09-15 09:52, Kyotaro Horiguchi wrote:
> > However, when I reran the command, it complains about incompatible
> > encoding this time. I think it's more user-friendly to check for the
> > encoding compatibility before the check for missing --icu-locale
> > option.
> > regards.
>
> In continuation of options check: AFAICS the following checks in
> initdb
>
> if (locale_provider == COLLPROVIDER_ICU)
> {
> if (!icu_locale)
> pg_fatal("ICU locale must be specified");
>
> /*
> * In supported builds, the ICU locale ID will be checked by the
> * backend during post-bootstrap initialization.
> */
> #ifndef USE_ICU
> pg_fatal("ICU is not supported in this build");
> #endif
> }
>
> are executed approximately when they are executed in create database
> after getting all the necessary data from the template database:

initdb doesn't work that way, but anyway, I realized that I am
proposing to move that code in setlocales() to the caller function as
the result. I don't think setlocales() is the place for the code
because icu locale has no business with what the function does. That
being said there's no obvious reason we *need* to move the code out to
its caller.

> if (dblocprovider == COLLPROVIDER_ICU)
> {
> /*
> * This would happen if template0 uses the libc provider but the new
> * database uses icu.
> */
> if (!dbiculocale)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("ICU locale must be specified")));
> }
>
> if (dblocprovider == COLLPROVIDER_ICU)
> check_icu_locale(dbiculocale);
>
> But perhaps the check that --icu-locale cannot be specified unless
> locale provider icu is chosen should also be moved here? So all these
> checks will be in one place and it will use the provider from the
> template database (which could be icu):
>
> $ initdb --locale-provider icu --icu-locale en-US -D data &&
> pg_ctl -D data -l logfile start &&
> createdb --icu-locale ru-RU --template template0 mydb
> ...
> createdb: error: database creation failed: ERROR: ICU locale cannot be
> specified unless locale provider is ICU

And, I realized that this causes bigger churn than I thought. So, I'm
sorry but I withdraw the comment.

Thus the first proposed patch will be more or less the direction we
would go. And the patch looks good to me as a whole.

+ errmsg("encoding \"%s\" is not supported with ICU provider",

+ pg_log_error("encoding \"%s\" is not supported with ICU provider",
+ pg_encoding_to_char(encodingid));

I might be wrong, but the messages look wrong to me. The alternatives
below might work.

"encoding \"%s\" is not supported by ICU"
"encoding \"%s\" cannot be used for/with ICU locales"

+ pg_log_error_hint("Rerun %s and choose a matching combination.",
+ progname);

This doesn't seem to provide users with useful information.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-09-16 08:14:25 Re: ICU for global collation
Previous Message Peter Eisentraut 2022-09-16 07:57:39 Re: ICU for global collation