Re: Upgrading locale issues

From: rihad <rihad(at)mail(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Upgrading locale issues
Date: 2019-05-02 06:31:30
Message-ID: c8ceefea-88af-08a9-08ed-a4fc6ed223c7@mail.ru
Lists: pgsql-general

On 05/02/2019 12:26 AM, Peter Geoghegan wrote:
> On Mon, Apr 29, 2019 at 7:45 AM rihad <rihad(at)mail(dot)ru> wrote:
>> Hi. Today we ran pg_ctl promote on a slave server (10.7) and started
>> using it as a master. The OS also got upgraded FreeBSD 10.4 -> FreeBSD
>> 11.2. And you guessed it, most varchar indexes got corrupted because
>> system locale changed in subtle ways. So I created the extension amcheck
>> and reindexed all bad indexes one by one. Is there any way to prevent
>> such things in the future? Will switching to ICU fix all such issues?
> Not necessarily, but it will detect the incompatibility more or less
> automatically, making it far more likely that the problem will be
> caught before it does any harm. ICU versions collations, giving
> Postgres a way to reason about their compatibility over time. The libc
> collations are not versioned, though (at least not in any standard way
> that Postgres can take advantage of).
>
>> The problem with it is that ICU collations are absent in pg_collation,
>> initdb should be run to create them, but pg_basebackup only runs on an
>> empty base directory, so I couldn't run initdb + pg_basebackup to
>> prepare the replica server. I believe I can run the create collation
>> command manually, but what would it look like for en-x-icu?
> It is safe to call pg_import_system_collations() directly, which is
> all that initdb does. This is documented, so you wouldn't be relying
> on a hack.
>
Thanks for the reply. Do you know what a "decent" ICU collation would be
to bind to a column's definition so that it would mimic a UTF-8
encoding for a multilingual column? Maybe und-x-icu? In most cases we
aren't much concerned about sort order; we just want the indexes to
cope better with future PG/ICU upgrades. But what does und(efined) even
mean with respect to collations? With UTF-8 at least some default
collation is specified, like en_US.UTF-8. Will ORDER BY "icu_und_column"
return results in a completely undefined order?
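
For concreteness, here is a minimal sketch of what I think the setup
would look like, assuming a PostgreSQL 10 server built with ICU support
and a superuser session. The table, column and index names are
hypothetical. The comment on und-x-icu follows the PostgreSQL
documentation, which describes it as the ICU "root" collation.

```sql
-- Import the collations the operating system and ICU library provide.
-- This is the documented function mentioned above; it needs superuser.
SELECT pg_import_system_collations('pg_catalog');

-- List ICU collations with the version recorded in the catalog and the
-- version the installed ICU library reports now. A mismatch suggests
-- that indexes depending on the collation should be rebuilt.
SELECT collname,
       collversion,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collprovider = 'i'
ORDER BY collname;

-- Hypothetical table: bind a multilingual column to und-x-icu, which
-- the PostgreSQL documentation describes as the ICU "root" collation.
CREATE TABLE messages (
    id   bigserial PRIMARY KEY,
    body varchar(200) COLLATE "und-x-icu"
);

-- The index inherits the column's collation.
CREATE INDEX messages_body_idx ON messages (body);

-- Sorting then uses the column's ICU collation.
SELECT body FROM messages ORDER BY body;

-- After an ICU upgrade: rebuild the affected index first, then record
-- the new collation version in the catalog.
REINDEX INDEX messages_body_idx;
ALTER COLLATION "und-x-icu" REFRESH VERSION;
```

The version comparison is the part that matters for upgrades: it only
reports that something changed, so the REINDEX still has to be done by
hand before refreshing the recorded version.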
