From: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Joe Conway <mail(at)joeconway(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Order changes in PG16 since ICU introduction |
Date: | 2023-06-12 21:04:04 |
Message-ID: | 78ac19e5-c5c3-c739-3080-ad7ef604df8d@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 09.06.23 02:36, Jeff Davis wrote:
> Patches 0001, 0002:
>
> These patches implement the built-in provider and automatically change
> provider=icu to provider=builtin when the locale is C.
I object to adding a new provider for PG16 (patch 0001). This is
clearly a new feature, which wasn't even contemplated a few weeks ago.
> * Switch to the libc provider for the C locale: would make the libc
> provider even more complicated and had some potential for confusion,
> and also has catalog representation problems when --locale is specified
> along with --lc-ctype.
I don't follow this concern. This could be done entirely within initdb.
I mean, just change the default for --locale-provider if --locale=C is
given. That's like 10 lines of code in initdb.c.
I don't think I want CREATE DATABASE or CREATE COLLATION to have that
logic, nor do they really need it.
> Patch 0003:
>
> Makes LOCALE apply to all providers. The overall feel after this patch
> is that "locale" now means the collation locale, and
> LC_COLLATE/LC_CTYPE are for the server environment. When using libc,
> LC_COLLATE and LC_CTYPE still work as they did before, but their
> relationship to database collation feels more like a special case of
> the libc provider. I believe most people favor this patch and I haven't
> seen recent objections.
This seems reasonable.
> 1. If you specify --locale-provider=builtin at initdb time, you *must*
> specify --locale=C/POSIX, otherwise you get an error.
Shouldn't the --locale option be ignored (or rejected) in that case.
Why insist on it being specified?
> 2. Patch 0004 is possibly out of scope for 16, but it felt consistent
> with the other UI changes and low risk. Please try with/without before
> objecting.
Also clearly a new feature. Also the implications of various upgrade,
dump/restore scenarios are not fully explored.
I think it's an interesting idea, to make CREATE DATABASE and CREATE
COLLATION also default to icu of the respective higher scope has been
set to icu. In fact, this makes me wonder now whether changing the
default to icu in *only* initdb is sensible. But again, we'd need to
see the full matrix of upgrade scenarios laid out here.
> 3. Daniel Verite felt that we should only change the provider from ICU
> to "builtin" for the C locale if the provider is defaulting to ICU; not
> if it's specified as ICU.
Correct, we shouldn't override what was explicitly specified.
> I did not differentiate between specifying
> ICU and defaulting to ICU because:
> a. "libc" unconditionally uses the built-in memcmp() logic for C, it
> never actually uses libc
> b. If a user really wants the root locale or the en-US-u-va-posix
> locale, they can specify those directly
> c. I don't see any plausible case where it helps a user to keep
> provider=icu when locale=C.
If the user specifies that, that's up to them to deal with the outcomes.
Just changing it to something different seems wrong.
> 4. Joe Conway and Peter Eisentraut both felt that C.UTF-8 with
> provider=icu should not be changed to use the builtin provider, and
> instead passed on to ICU. I implemented a compromise where initdb will
> change C.UTF-8 to the built-in provider; but CREATE DATABASE/COLLATION
> will pass it along to ICU (which may support it as en-US-u-va-posix in
> some versions, or may throw an error in other versions). My reasoning
> is that initdb is pulling from the environment, and we should try
> harder to succeed on any reasonable environmental settings (otherwise
> initdb with default settings could fail); whereas we can be more strict
> with CREATE DATABASE/COLLATION.
I'm not objecting to changing anything about C.UTF-8, but I am objecting
to changing anything substantial in PG16.
> 5. For the built-in provider, initdb defaults to UTF-8 rather than
> SQL_ASCII. Otherwise, you would be unable to use ICU at all later,
> because ICU doesn't support SQL_ASCII.
Also a behavior change that is not appropriate for PG16 at this stage.
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2023-06-12 21:06:18 | Re: Shouldn't construct_array_builtin and deconstruct_array_builtin agree on types? |
Previous Message | Tomas Vondra | 2023-06-12 20:38:59 | Shouldn't construct_array_builtin and deconstruct_array_builtin agree on types? |