Re: Slow performance of collate "en_US.utf8"

From: Joe Conway <mail(at)joeconway(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Alexey Borschev <a(dot)borschev(at)postgrespro(dot)ru>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Slow performance of collate "en_US.utf8"
Date: 2025-02-28 20:02:59
Message-ID: eb319bfd-fbfb-42c1-a01c-406988e177b5@joeconway.com
Lists: pgsql-performance

On 2/28/25 09:16, Laurenz Albe wrote:
> On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote:
>> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4.
>
> I'd say that you would have to complain to the authors of the
> GNU C library, which provides this collation.

Yep -- glibc starting with version 2.21 has a massive performance
regression in certain cases, and the glibc folks have basically said
they will not fix it. If you try the same thing on RHEL 7.x with glibc
2.17, it will perform about the same as ICU.
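
A rough way to see the difference on a given machine is to sort the same
data under several collations and compare the timings. The sketch below
is only an illustration: the table name sort_test and the generated data
are hypothetical, "en_US.utf8" exists only if that OS locale was present
at initdb time, and "en-x-icu" exists only if the server was built with
ICU. Whether the glibc slowdown shows up depends on the data being
sorted, so substitute real data where possible.

```sql
-- Hypothetical test table: one million pseudo-random strings.
CREATE TEMP TABLE sort_test AS
SELECT md5(i::text) AS t
FROM generate_series(1, 1000000) AS i;

-- Same sort under three collations; compare the reported execution times.
-- TIMING OFF drops per-node timing overhead but keeps the total time.
EXPLAIN (ANALYZE, TIMING OFF)
SELECT t FROM sort_test ORDER BY t COLLATE "en_US.utf8";  -- glibc (libc provider)

EXPLAIN (ANALYZE, TIMING OFF)
SELECT t FROM sort_test ORDER BY t COLLATE "en-x-icu";    -- ICU, if available

EXPLAIN (ANALYZE, TIMING OFF)
SELECT t FROM sort_test ORDER BY t COLLATE "C";           -- byte-order baseline
```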

If you are using pg17 you should consider using the new builtin
collation provider -- it will perform almost as well as the 'C' locale.
Something like:
--------
CREATE DATABASE builtincoll LOCALE_PROVIDER builtin
BUILTIN_LOCALE 'C.UTF-8' TEMPLATE template0;
--------
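
If recreating the database is not an option, the builtin provider can
also be applied per query or per column. The following is a minimal
sketch, assuming PostgreSQL 17 and a UTF8-encoded database; my_table and
my_col are hypothetical names. Note that C.UTF-8 sorts by Unicode code
point, so the resulting order is not the linguistic order that
"en_US.utf8" produces.

```sql
-- Show the locale provider of each database
-- ('b' = builtin, 'c' = libc, 'i' = icu).
SELECT datname, datlocprovider, datlocale, datcollate
FROM pg_database;

-- pg_c_utf8 is the predefined builtin-provider collation in PG 17.
-- Apply it to a single query (my_table and my_col are hypothetical):
SELECT my_col
FROM my_table
ORDER BY my_col COLLATE pg_c_utf8;

-- Or change the column's collation permanently. This takes a heavy lock
-- and rebuilds any indexes on the column.
ALTER TABLE my_table
    ALTER COLUMN my_col TYPE text COLLATE pg_c_utf8;
```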

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
