From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Daniel Verite <daniel(at)manitou-mail(dot)org>, pryzby(at)telsasoft(dot)com |
Subject: | Re: ICU for global collation |
Date: | 2022-08-17 16:53:59 |
Message-ID: | 20220817165359.mcco6ufe2vchokqo@jrouhaud |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Mon, Aug 15, 2022 at 03:06:32PM +0300, Marina Polyakova wrote:
>
> 1.1) It looks like there's a bug in the function get_db_infos
> (src/bin/pg_upgrade/info.c), where the version of the old cluster is always
> checked:
>
> if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
> snprintf(query + strlen(query), sizeof(query) - strlen(query),
> "'c' AS datlocprovider, NULL AS daticulocale, ");
> else
> snprintf(query + strlen(query), sizeof(query) - strlen(query),
> "datlocprovider, daticulocale, ");
>
> With the simple patch
>
> diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
> index df374ce4b362b4c6c87fc1fd0e476e5d6d353d9e..53ea348e211d3ac38334292bc16cb814bc13bb87
> 100644
> --- a/src/bin/pg_upgrade/info.c
> +++ b/src/bin/pg_upgrade/info.c
> @@ -319,7 +319,7 @@ get_db_infos(ClusterInfo *cluster)
>
> snprintf(query, sizeof(query),
> "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
> - if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
> + if (GET_MAJOR_VERSION(cluster->major_version) < 1500)
> snprintf(query + strlen(query), sizeof(query) - strlen(query),
> "'c' AS datlocprovider, NULL AS daticulocale, ");
> else
>
> I got the expected error during the upgrade:
>
> locale providers for database "template1" do not match: old "libc", new
> "icu"
> Failure, exiting
Good catch. There's unfortunately not a lot of regression tests for
ICU-initialized clusters. I'm wondering if the build-farm client could be
taught about the locale provider rather than assuming libc. Clearly that
wouldn't have caught this issue, but it should still increase the coverage a
bit (I'm thinking of the recent problem with the abbreviated keys).
> 1.2) It looks like the mentioned asserion in dbcommands.c conflicts with the
> following lines earlier:
>
> if (dbiculocale == NULL)
> dbiculocale = src_iculocale;
>
> The following patch works for me:
>
> diff --git a/src/backend/commands/dbcommands.c
> b/src/backend/commands/dbcommands.c
> index b31a30550b025d48ba3cc250dc4c15f41f9a80be..17a2942341e528c01182fb6d4878580f2706bec9
> 100644
> --- a/src/backend/commands/dbcommands.c
> +++ b/src/backend/commands/dbcommands.c
> @@ -1048,6 +1048,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("ICU locale must be specified")));
> }
> + else
> + dbiculocale = NULL;
>
> if (dblocprovider == COLLPROVIDER_ICU)
> check_icu_locale(dbiculocale);
I think it would be better to do that in the various variable initialization.
Maybe switch the dblocprovider and dbiculocale initialization, and only
initialize dbiculocale if dblocprovider is COLLPROVIDER_ICU.
> 2) CREATE DATABASE does not always require the icu locale unlike initdb and
> createdb:
>
> $ createdb mydb --locale en_US.UTF-8 --template template0 --locale-provider
> icu
> createdb: error: database creation failed: ERROR: ICU locale must be
> specified
>
> $ psql -c "CREATE DATABASE mydb LOCALE \"en_US.UTF-8\" TEMPLATE template0
> LOCALE_PROVIDER icu" postgres
> CREATE DATABASE
>
> I'm wondering if this is not a fully-supported feature (because createdb
> creates an SQL command with LC_COLLATE and LC_CTYPE options instead of
> LOCALE option) or is it a bug in CREATE DATABASE?.. From
> src/backend/commands/dbcommands.c:
>
> if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale)
> {
> if (dlocale && dlocale->arg)
> dbiculocale = defGetString(dlocale);
> }
This discrepancy between createdb and CREATE DATABASE looks like an oversight,
as createdb indeed interprets --locale as:
if (locale)
{
if (lc_ctype)
pg_fatal("only one of --locale and --lc-ctype can be specified");
if (lc_collate)
pg_fatal("only one of --locale and --lc-collate can be specified");
lc_ctype = locale;
lc_collate = locale;
}
AFAIK the fallback in the CREATE DATABASE case is expected as POSIX locale
names should be accepted by icu, so this should work for createdb too.
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2022-08-17 17:44:23 | Re: pg_walinspect: ReadNextXLogRecord's first_record argument |
Previous Message | Bharath Rupireddy | 2022-08-17 16:28:56 | Re: pg_walinspect: ReadNextXLogRecord's first_record argument |