From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Cc: | Jeremy Schneider <schneider(at)ardentperf(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, "Nasby, Jim" <nasbyj(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Collation version tracking for macOS |
Date: | 2022-10-21 21:24:06 |
Message-ID: | CA+hUKGL36vXMfcaDq+U1ZkoSsdfFnNx7GxhGM7aYzEbKs1W0=Q@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
Here is a rebase of this experimental patch. I think the basic
mechanics are promising, but we haven't agreed on a UX. I hope we can
figure this out.
Restating the choice made in this branch of the experiment: Here I
try to be just like DB2 (if I understood its manual correctly).
In DB2, you can use names like "en_US" if you don't care about
changes, and names like "CLDR181_en_US" if you do. It's the user's
choice to use the second kind to avoid "unexpected effects on
applications or database objects" after upgrades. Translated to
PostgreSQL concepts, you can use a database default ICU locale like
"en-US" if you don't care and "67:en-US" if you do, and for COLLATION
objects it's the same. The convention I tried in this patch is that
you use either "en-US-x-icu" (which points to "en-US") or
"en-US-x-icu67" (which points to "67:en-US") depending on whether you
care about this problem.
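
To make the naming convention concrete, here is a minimal sketch of how a name like "67:en-US" could be split into a major version and a plain locale. The helper name split_icu_locale and the "0 means the linked ICU" convention are assumptions for illustration, not code from the attached patch.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: split "67:en-US" into
 * major version 67 and locale "en-US".  A name without a numeric prefix
 * yields major version 0, meaning "use the ICU the binary is linked to".
 * The returned pointer aliases the input string.
 */
static const char *
split_icu_locale(const char *spec, int *major)
{
    const char *p = spec;
    int         v = 0;

    /* Consume leading digits; stop early rather than overflow. */
    while (isdigit((unsigned char) *p) && v < 1000)
        v = v * 10 + (*p++ - '0');

    /* Only a non-empty digit run followed by ':' counts as a prefix. */
    if (p > spec && *p == ':')
    {
        *major = v;
        return p + 1;
    }

    *major = 0;
    return spec;
}
```

With this, "en-US" comes back unchanged with major version 0, while "67:en-US" yields 67 and "en-US".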
I recognise that this is a bit cheesy: it's all the user's problem to
deal with or ignore.
An alternative mentioned by Peter E was that the locale names
shouldn't carry the prefix, but somehow we should have a list of ICU
versions to search for a matching datcollversion/collversion. How
would that look? Perhaps a GUC, icu_library_versions = '63, 67, 71'?
There is currently a natural and smallish range of supported versions,
probably something like 54 ... U_ICU_VERSION_MAJOR_NUM, but it seems a
bit weird to try to dlopen ~25 libraries or whatever it might be...
Do you think we should try to code this up?
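
As a rough sketch of what the GUC side might involve, the following parses a value like '63, 67, 71' into a list of major versions that could then be tried in turn when looking for a matching collversion. The function name parse_icu_versions and the accepted range are assumptions; a real GUC would go through PostgreSQL's own check hooks and list-splitting routines rather than this standalone parser.

```c
#include <stdlib.h>

/*
 * Hypothetical parser for a setting such as icu_library_versions.
 * Stores up to "max" major versions in out[] and returns how many were
 * found, or -1 if the input is malformed or has too many entries.
 */
static int
parse_icu_versions(const char *list, int *out, int max)
{
    const char *p = list;
    int         n = 0;

    while (*p != '\0')
    {
        char       *end;
        long        v = strtol(p, &end, 10);   /* skips leading spaces */

        if (end == p || v <= 0 || v > 999 || n >= max)
            return -1;
        out[n++] = (int) v;

        p = end;
        while (*p == ' ')
            p++;
        if (*p == ',')
            p++;                /* move on to the next entry */
        else if (*p != '\0')
            return -1;          /* junk between entries */
    }
    return n;
}
```

An explicit list keeps the number of dlopen() attempts bounded by what the administrator wrote, instead of probing the whole supported range.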
I haven't tried it, but the main usability problem I predict with that
idea is this: It can cope with a scenario where you created a
database with ICU 63 and started using a default of "en" and maybe
some explicit fr-x-icu or whatever, and then you upgrade to a new
postgres binary using ICU 71, and, as long as you still have ICU 63
installed, it'll just magically keep using 63, now via dlopen(). But it
doesn't provide a way for me to create a new database that uses 63 on
purpose when I know what I'm doing. There are various reasons I might
want to do that.
Maybe the ideas could be combined? Perhaps "en" means "create using
binary's linked ICU, open using search-by-collversion", while "67:en"
explicitly says which to use?
Changes since last version:
* Now it just uses the default dlopen() search path, unless you set
icu_library_path. Is that a security problem? It's pretty
convenient, because it means you can just "apt-get install libicu63"
(or local equivalent) and that's all, now 63 is available.
* To try the idea out, I made it automatically create "*-x-icu67"
alongside the regular "-x-icu" collation objects at initdb time.
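
To illustrate the first point, here is a sketch of how the name handed to dlopen() could be built. A name without a slash is looked up on the default search path, while a name containing a slash is treated as a path. The helper name icu_library_name and the Linux-style "libicui18n.so.NN" naming are assumptions for illustration (other platforms name their libraries differently), and this is not the patch's actual code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the library name for a given ICU major
 * version.  With no directory the result is a bare name such as
 * "libicui18n.so.63", which dlopen() resolves on its default search
 * path; with a directory (standing in for icu_library_path) the result
 * is an explicit path.  Returns 0 on success, -1 if buf is too small.
 */
static int
icu_library_name(char *buf, size_t buflen, const char *dir, int major)
{
    int         n;

    if (dir != NULL && dir[0] != '\0')
        n = snprintf(buf, buflen, "%s/libicui18n.so.%d", dir, major);
    else
        n = snprintf(buf, buflen, "libicui18n.so.%d", major);

    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

The security question above is about the bare-name case, since whatever the dynamic loader finds first under that name is what gets loaded.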
Attachment | Content-Type | Size |
---|---|---|
v5-0001-WIP-Multi-version-ICU.patch | application/x-patch | 30.9 KB |