Re: Strangeness with UNIQUE indexes and UTF-8

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: Omar Kilani <omar(dot)kilani(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Strangeness with UNIQUE indexes and UTF-8
Date: 2021-06-06 16:36:26
Message-ID: 60BCF98A.2080101@anastigmatix.net
Lists: pgsql-hackers

On 06/06/21 11:08, Omar Kilani wrote:
> I'm running pg_verify_checksums on the cluster, but the database is
> many TB so it'll be a bit.

Index corruption caused by a locale change is not the sort of thing
checksums would detect. The entries were put into the index in the correct
order according to the old collation. The same entries can still be there,
intact and just fine according to the checksums; it is only that the new
collation would have put them in a different order. Index search algorithms
are fast because they assume the entries are correctly ordered, so they
skip regions of the index where the desired key "couldn't possibly be". If
that's where the old ordering put it, it won't be found.
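
A toy illustration of the effect, as a rough sketch only: it is not
PostgreSQL's btree code. The two "collations" here are stand-ins chosen for
the example (plain code-point order as the old one, a case-insensitive order
as the new one), and index_lookup is a hypothetical helper doing an ordinary
binary search.

```python
def old_key(s):
    # Stand-in for the old collation: plain code-point order (assumption).
    return s

def new_key(s):
    # Stand-in for the new collation: case-insensitive order (assumption).
    return (s.lower(), s)

def index_lookup(entries, target, key):
    """Binary search that trusts entries to be sorted according to key."""
    lo, hi = 0, len(entries)
    while lo < hi:
        mid = (lo + hi) // 2
        if key(entries[mid]) < key(target):
            lo = mid + 1   # skip the left half: target "can't be" there
        else:
            hi = mid       # skip the right half
    return lo < len(entries) and entries[lo] == target

# The "index" was built while the old collation was in effect.
entries = sorted(["apple", "Banana", "cherry", "Date"], key=old_key)

found_with_old = index_lookup(entries, "apple", old_key)
found_with_new = index_lookup(entries, "apple", new_key)
still_present = "apple" in entries   # a full scan still sees the entry
```

Every entry is byte-for-byte intact, so nothing here would trip a checksum,
yet the search under the new ordering walks away from the region where
"apple" actually sits. A uniqueness check that relies on the same kind of
lookup would miss the existing entry in the same way.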

Regards,
-Chap
