The icu_multilib
module provides control over the
version (or versions) of the ICU provider library used by
PostgreSQL, which can be different from the
version of ICU with which it was built.
Collations are a product of natural language, and natural language evolves over time; but PostgreSQL depends on stable ordering for structures such as indexes. Newer versions of ICU update the provided collators to adapt to changes in natural language, so it's important to control when and how those new versions of ICU are used to prevent problems such as index corruption.
This module assumes that the necessary versions of ICU are already
available, such as through the operating system's package manager; and
already properly installed in a single location accessible to
PostgreSQL. The configration variable
icu_multilib.library_path
should be set to the location
where these ICU library versions are installed.
icu_multilib
must be loaded via
shared_preload_libraries
.
icu_multilib
ignores any ICU library with a major
version greater than that with which PostgreSQL
was built.
The simplest way to use icu_multilib
is to force ICU
version stability. Version stability means that
PostgreSQL will use the same version of ICU for
all collations, even if the operating system or
PostgreSQL itself is upgraded or moved. The
stable version may or may not be the same as that with which
PostgreSQL was built.
Version stability may be based on both the major and minor version of ICU,
such as 68.2
; or on the major version alone, such as
68
. Specifying both the major and minor versions more
closely controls exactly which ICU library is loaded, but applying minor
version upgrades of the ICU library may interfere with normal
operation. Specifying the major version alone makes applying ICU minor
version updates simpler, but carries a risk of subtle differences in
collation order if the library is updated.
To configure for version stability, first make sure that
icu_multilib.library_path
is set correctly, which you
can observe by executing the
icu_multilib.library_versions()
function. Then set
icu_multilib.default_icu_version
to the stable version,
and set icu_multilib.search_by_collator_version
to
false
. These configuration variables should be set in
postgresql.conf
to ensure that the stable version of
ICU is used consistently.
Ensure that the stable version of the ICU library is always available, including on all replicas and through operating system and PostgreSQL upgrades.
Similarly, keep the settings in postgresql.conf
, and
keep them consistent across replicas and upgrades.
For best results, stabilize the ICU version immediately after
initdb
is run, to ensure that initial collations are
loaded from the stabilized version of the ICU library. To do this, use the
option --no-import-collations
for
initdb
, import the collations manually, and then refresh
the versions for the default collations. For instance:
$ initdb --no-import-collations -D data --locale-provider=icu --icu-locale="en_US" --locale="en_US.UTF-8" $ cat - >> data/postgresql.conf shared_preload_libraries = 'icu_multilib' icu_multilib.library_path = '/path/to/icu/lib' icu_multilib.default_icu_version = '65.1' icu_multilib.search_by_collator_version = false icu_multilib.include_builtin = false ^D $ pg_ctl -D data -l logfile start $ psql template1 WARNING: icu_multilib: collator version mismatch detected for locale "en_US" DETAIL: ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112". WARNING: icu_multilib: collator version mismatch detected for locale "en_US" DETAIL: ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112". WARNING: database "template1" has a collation version mismatch DETAIL: The database was created using collation version 153.112, but the operating system provides version 153.97. HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE template1 REFRESH COLLATION VERSION, or build PostgreSQL with the right library version. =# ALTER DATABASE template1 REFRESH COLLATION VERSION; NOTICE: changing version from 153.112 to 153.97 ALTER DATABASE =# SELECT pg_import_system_collations('pg_catalog'); pg_import_system_collations ----------------------------- 792 (1 row) =# \q $ psql postgres WARNING: icu_multilib: collator version mismatch detected for locale "en_US" DETAIL: ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112". WARNING: icu_multilib: collator version mismatch detected for locale "en_US" DETAIL: ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112". WARNING: database "postgres" has a collation version mismatch DETAIL: The database was created using collation version 153.112, but the operating system provides version 153.97. HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version. =# ALTER DATABASE postgres REFRESH COLLATION VERSION; NOTICE: changing version from 153.112 to 153.97 ALTER DATABASE =# SELECT pg_import_system_collations('pg_catalog'); pg_import_system_collations ----------------------------- 792 (1 row) =# \q $
The result will be a system based on ICU version 65.1
, and
PostgreSQL will use that version of ICU for all
collations that use the ICU provider.
Support for multiple ICU library versions in the same database allows the system to adapt to changes in natural language over time. As these changes are introduced in new versions of ICU, those new versions can be brought into PostgreSQL incrementally without interfering with existing data (or structures like indexes).
Collators provided by ICU also have a version which is distinct from the
version of ICU. For instance, in ICU version 70.1
, the
collator for the en_US
locale has version
153.112
. A collator for a given locale may or may not
be assigned a new version when the ICU library version is updated.
When a collation is created, PostgreSQL obtains
the version of the collator from the ICU library and records it in the
catalog. This recorded version can be updated using ALTER
COLLATION ... REFRESH VERSION
; or ALTER DATABASE
... REFRESH COLLATION VERSION
if it's the database's default
collation. These ALTER
commands obtain the new collator
version from the ICU library identified by
icu_multilib.default_icu_version
if set; otherwise they
obtain the new collator version from the built-in ICU library.
Set icu_multilib.search_by_collator_version
to
true
to enable icu_multilib
to
search (in descending order of ICU major version) among the
available libraries to find the first one that provides a collator with a
version that matches the one recorded in the catalog.
For example, ICU versions 62.2
and
63.2
both provide collator version
153.88
for the locale en_US
; while
ICU version 64.2
provides collator version
153.97
for the same locale. Searching for an ICU
library for a collation with a recorded version of
153.88
will find the ICU library with version
63.2
, because that's the ICU library with the highest
major version (63
) that provides collator version
153.88
. The function
icu_multilib.collator_version_search
simulates this
search for the given locale and version:
SELECT * FROM icu_multilib.collator_version_search('en_US', '153.88'); icu_version | uca_version | collator_version -------------+-------------+------------------ 63.2 | 11.0 | 153.88 (1 row)
Searching by collator version relies on the assumption that collators with the same version behave identically regardless of which ICU library provides it. The administrator must weigh the risks of this assumption against the convenience when managing multiple major versions of ICU.
icu_multilib.library_path
(string
)
The filesystem path in which to search for ICU libraries. Must be set
for icu_multilib
to be useful. Multiple major
versions of ICU may be present; but for each major version, only a
single minor version can be present.
icu_multilib.version_mismatch_log_level
(string
)
A version mismatch happens when the version of a collator provided by the ICU library is different than then collator version recorded when the collation was created (or last refreshed).
When a version mismatch is detected, a log message is emitted at this
level. WARNING
is the default. If the level is set to
ERROR
, it may interfere with normal operation.
icu_multilib.library_search_log_level
(string
)
icu_multilib
emits messages at this level while
determining which ICU library to choose. The default is
DEBUG1
. This setting is useful to diagnose problems
when a suitable ICU library is not found.
icu_multilib.default_icu_version
(string
)
Identifies the version of the ICU library to choose; or, if icu_multilib.search_by_collator_version
is true
, the ICU library to choose if no identical match to the collator version is found. The default is the empty string, meaning to use the built-in ICU library.
Valid values are either of the form major.minor
such
as 70.1
; or a major version number alone, such as
70
.
icu_multilib.include_builtin
(string
)
Determines whether the built-in version of ICU is included when
searching for a suitable ICU library. The default is
true
.
Even if set to false
, the built-in ICU library may be
used if no other suitable ICU library is identified and
icu_multilib.default_icu_version
is set to the empty
string.
icu_multilib.search_by_collator_version
(string
)
Causes icu_multilib
to identify the ICU library
with the highest major version that offers the required collator with an
exactly-matching collator version. The default is false
.
When set to true
, icu_multilib
relies on equal collator versions to produce identical collation order.
If no ICU library is found with an exactly-matching collator version,
icu_multilib
will fall back to the
icu_multilib.default_icu_version
; or if not set, to
the built-in ICU library.
Table F.9. icu_multilib
Functions
Function Description |
---|
Returns details for each available ICU library found in
|
Returns details for all available collators provided by the ICU library
with the given |
Performs a search for the appropriate ICU library given the locale name and requested collator version, and returns details about the ICU library.
If
If |
Returns details about available collators for the given
|
SELECT icu_version, unicode_version, cldr_version FROM icu_multilib.library_versions() LIMIT 3; icu_version | unicode_version | cldr_version -------------+-----------------+-------------- 70.1 | 14.0 | 40.0 69.1 | 13.0 | 39.0 68.2 | 13.0 | 38.1 (3 rows)
SELECT * FROM icu_multilib.collator_versions('en_US') WHERE icu_version BETWEEN '60.0' and '65.0'; icu_version | uca_version | collator_version -------------+-------------+------------------ 64.2 | 12.1 | 153.97 63.2 | 11.0 | 153.88 62.2 | 11.0 | 153.88 61.2 | 10.0 | 153.80 60.3 | 10.0 | 153.80 (5 rows)
SELECT * FROM icu_multilib.collator_version_search('en_US', '153.97'); icu_version | uca_version | collator_version -------------+-------------+------------------ 65.1 | 12.1 | 153.97 (1 row)
Jeff Davis <jdavis@postgresql.org>
, Thomas Munro <tmunro@postgresql.org>