From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | pg_upgrade fails to preserve old versions of the predefined collations |
Date: | 2019-11-28 20:08:08 |
Message-ID: | 01adb62f-b114-1237-63ae-b7f030a18389@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello hackers,
When dealing with an OS upgrade, a some kind of anomaly related to
collations was found.
Suppose, we have Debian 8 with postgresql 12 installed.
Then we create a custom collation:
CREATE COLLATION russian (provider=icu, locale='ru_RU');
and
SELECT oid, collname, collnamespace, collprovider, collversion FROM
pg_collation WHERE collname like 'ru%';
returns
12847 | ru-RU-x-icu | 11 | i | 58.0.0.50
...
16384 | russian | 2200 | i | 58.0.0.50
Then let's create two tables with text columns and indexes and fill them
with some data:
CREATE TABLE test_icu_ru (f1 varchar COLLATE "ru-RU-x-icu", i int);
INSERT INTO test_icu_ru SELECT chr(x), x FROM generate_series(1, 2000)
as y(x); CREATE INDEX ON test_icu_ru (f1);
CREATE TABLE test_icu_russian (f1 varchar COLLATE "russian", i int);
INSERT INTO test_icu_russian SELECT chr(x), x FROM generate_series(1,
2000) as y(x); CREATE INDEX ON test_icu_russian (f1);
Perform two test queries:
postgres=# select * from test_icu_ru where f1=chr(821);
f1 | i
----+-----
̵ | 821
(1 row)
postgres=# select * from test_icu_russian where f1=chr(821);
f1 | i
----+-----
̵ | 821
(1 row)
postgres=# EXPLAIN select * from test_icu_ru where f1=chr(821);
QUERY
PLAN
--------------------------------------------------------------------------------------
Index Scan using test_icu_ru_f1_idx on test_icu_ru (cost=0.28..8.29
rows=1 width=6)
Index Cond: ((f1)::text = '̵'::text)
(2 rows)
postgres=# EXPLAIN select * from test_icu_russian where f1=chr(821);
QUERY
PLAN
------------------------------------------------------------------------------------------------
Index Scan using test_icu_russian_f1_idx on test_icu_russian
(cost=0.28..8.29 rows=1 width=6)
Index Cond: ((f1)::text = '̵'::text)
(2 rows)
(The indexes are indeed used by the above queries.)
Now suppose that the OS is upgraded to Debian 9 (or the pgdata just
moved to Debian 9 with the postgresql 12).
The same queries return:
postgres=# select * from test_icu_ru where f1=chr(821);
WARNING: collation "ru-RU-x-icu" has version mismatch
DETAIL: The collation in the database was created using version
58.0.0.50, but the operating system provides version 153.64.29.
HINT: Rebuild all objects affected by this collation and run ALTER
COLLATION pg_catalog."ru-RU-x-icu" REFRESH VERSION, or build PostgreSQL
with the right library version.
f1 | i
----+---
(0 rows)
postgres=# select * from test_icu_russian where f1=chr(821);
WARNING: collation "russian" has version mismatch
DETAIL: The collation in the database was created using version
58.0.0.50, but the operating system provides version 153.64.29.
HINT: Rebuild all objects affected by this collation and run ALTER
COLLATION public.russian REFRESH VERSION, or build PostgreSQL with the
right library version.
f1 | i
----+---
(0 rows)
We get no data due to the real collation/sort order change but the
warning says what to do.
The query presented at
https://www.postgresql.org/docs/12/sql-altercollation.html returns:
Collation | Object
-------------------------+-------------------------------------
collation "ru-RU-x-icu" | column f1 of table test_icu_ru
collation "ru-RU-x-icu" | index test_icu_ru_f1_idx
collation russian | column f1 of table test_icu_russian
collation russian | index test_icu_russian_f1_idx
So the documented behavior is observed.
But after pg_upgrade:
pg_createcluster 12 new
/usr/lib/postgresql/12/bin/pg_upgrade -b /usr/lib/postgresql/12/bin -B
/usr/lib/postgresql/12/bin -d /etc/postgresql/12/main -D
/etc/postgresql/12/new
In the new cluster the same queries return:
postgres=# select * from test_icu_russian where f1=chr(821);
WARNING: collation "russian" has version mismatch
DETAIL: The collation in the database was created using version
58.0.0.50, but the operating system provides version 153.64.29.
HINT: Rebuild all objects affected by this collation and run ALTER
COLLATION public.russian REFRESH VERSION, or build PostgreSQL with the
right library version.
f1 | i
----+---
(0 rows)
postgres=# select * from test_icu_ru where f1=chr(821);
f1 | i
----+---
(0 rows)
(There is no warning for the predefined collation now.)
The query presented at
https://www.postgresql.org/docs/12/sql-altercollation.html returns:
Collation | Object
-------------------+-------------------------------------
collation russian | column f1 of table test_icu_russian
collation russian | index test_icu_russian_f1_idx
(2 rows)
and
SELECT oid, collname, collnamespace, collprovider, collversion FROM
pg_collation WHERE collname like 'ru%';
returns
oid | collname | collnamespace | collprovider | collversion
-------+-------------+---------------+--------------+-------------
12884 | ru-BY-x-icu | 11 | i | 153.64.29
12885 | ru-KG-x-icu | 11 | i | 153.64.29
12886 | ru-KZ-x-icu | 11 | i | 153.64.29
12887 | ru-MD-x-icu | 11 | i | 153.64.29
12888 | ru-RU-x-icu | 11 | i | 153.64.29
12889 | ru-UA-x-icu | 11 | i | 153.64.29
12883 | ru-x-icu | 11 | i | 153.64.29
12329 | ru_RU | 11 | c |
12328 | ru_RU.utf8 | 11 | c |
16402 | russian | 2200 | i | 58.0.0.50
So only the custom collation' version is actual, but predefined ones
correspond to a newer libicu, but not to actual data.
For all that, REINDEX repairs both indexes:
postgres=# REINDEX INDEX test_icu_russian_f1_idx;
REINDEX
postgres=# select * from test_icu_russian where f1=chr(821);
f1 | i
----+-----
̵ | 821
(1 row)
postgres=# REINDEX INDEX test_icu_ru_f1_idx;
REINDEX
postgres=# select * from test_icu_ru where f1=chr(821);
f1 | i
----+-----
̵ | 821
(1 row)
So for now it seems dangerous to use predefined collations as their old
versions are not preserved by pg_upgrade and the user doesn't know which
indexes affected by the actual ICU collation changes.
Best regards,
Alexander
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2019-11-28 20:25:02 | Re: pg_upgrade fails to preserve old versions of the predefined collations |
Previous Message | Mark Dilger | 2019-11-28 18:51:48 | Do XID sequences need to be contiguous? |