BUG #17746: Partitioning by hash of a text depends on icu version when text collation is not deterministic

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: andrewbille(at)gmail(dot)com
Subject: BUG #17746: Partitioning by hash of a text depends on icu version when text collation is not deterministic
Date: 2023-01-11 15:09:21
Message-ID: 17746-83bc7dd7e202dc70@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17746
Logged by: Andrew Bille
Email address: andrewbille(at)gmail(dot)com
PostgreSQL version: 15.1
Operating system: Ubuntu 20.04
Description:

Hello
I build two postgres (REL_15_STABLE) with different ICU versions (60.3 and
66.1)

In "old-icu-server" I do (exerpt from regression test collate.icu.utf8):

CREATE COLLATION coll (provider = icu, deterministic = false, locale =
'@colStrength=secondary');
CREATE TABLE test (
a integer,
b text COLLATE coll
)
PARTITION BY HASH (b);
CREATE TABLE test_0 (
a integer,
b text COLLATE coll
);
CREATE TABLE test_1 (
a integer,
b text COLLATE coll
);
ALTER TABLE ONLY test ATTACH PARTITION test_0 FOR VALUES WITH (modulus 2,
remainder 0);
ALTER TABLE ONLY test ATTACH PARTITION test_1 FOR VALUES WITH (modulus 2,
remainder 1);
INSERT INTO test VALUES (1, 'def');
SELECT tableoid::regclass::text as part, hashtext(b) as hash, * FROM test;
EXPLAIN SELECT * FROM test WHERE b='def';
SELECT * FROM test WHERE b='def';

and recieve:

CREATE COLLATION
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
ALTER TABLE
INSERT 0 1
part | hash | a | b
--------+------------+---+-----
test_0 | -493883174 | 1 | def
(1 row)

QUERY PLAN
-------------------------------------------------------------
Seq Scan on test_0 test (cost=0.00..25.88 rows=6 width=36)
Filter: (b = 'def'::text)
(2 rows)
a | b
---+-----
1 | def
(1 row)

Then I stop the server and start this cluster with "new-icu-binaries"
(simulating OS upgrade) and do:
SELECT tableoid::regclass::text as part, hashtext(b) as hash, * FROM test;
EXPLAIN SELECT * FROM test WHERE b='def';
SELECT * FROM test WHERE b='def';

I got the following error and incorrect SELECT result:

WARNING: collation "coll" has version mismatch
DETAIL: The collation in the database was created using version 153.80, but
the operating system provides version 153.14.
HINT: Rebuild all objects affected by this collation and run ALTER
COLLATION public.coll REFRESH VERSION, or build PostgreSQL with the right
library version.
part | hash | a | b
--------+------------+---+-----
test_0 | 2087909916 | 1 | def
(1 row)

QUERY PLAN
-------------------------------------------------------------
Seq Scan on test_1 test (cost=0.00..25.88 rows=6 width=36)
Filter: (b = 'def'::text)
(2 rows)

a | b
---+---
(0 rows)

---
If I make a dump in the "old-icu-server" and try to upload to the
"new-icu-server", then I get another failure:

psql:dump.sql:89: ERROR: new row for relation "test_0" violates partition
constraint
DETAIL: Failing row contains (1, def).
CONTEXT: COPY test_0, line 1: "1 def"
COPY 0

This case discovered when trying to restore on OracleLinux 8 a dump produced
on Ubuntu 20.04 by the modified sql/collate.icu.utf8.sql

What bothers me is that partitioning depends on the hash that can be
computed differently with the OS upgrade/migration. Also I was surprised not
to find a description of the hashtext()/hashtextextended() in the docs.

Thanks!

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alex Richman 2023-01-11 15:41:30 Re: Logical Replica ReorderBuffer Size Accounting Issues
Previous Message David G. Johnston 2023-01-11 13:22:26 Re: BUG #17745: RETURNING after INSERT does not return the same structure of response