Re: Performance degradation in Index searches with special characters

From: Joe Conway <mail(at)joeconway(dot)com>
To: Andrey Stikheev <andrey(dot)stikheev(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Performance degradation in Index searches with special characters
Date: 2024-10-06 17:45:15
Message-ID: 1d6a52ad-6c67-4bbf-80d8-ab28040851e9@joeconway.com
Lists: pgsql-performance

On 10/6/24 13:28, Andrey Stikheev wrote:
> Thanks for your feedback. After looking into it further, it seems the
> performance issue is indeed related to the default collation settings,
> particularly when handling certain special characters like '<' in the
> glibc strcoll_l() function. This was confirmed during my testing on
> Debian 12 with glibc version 2.36 (this OS and glibc are being used in
> our office's Docker image: https://hub.docker.com/_/postgres).

This is not surprising. There is a performance regression that started
in glibc 2.21 with regard to sorting Unicode text. Test with RHEL 7.x
(glibc 2.17) and I bet you will see results comparable to ICU. The best
answer in the long term, IMHO, is likely to use the new built-in
collation provider just released in Postgres 17.
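
For anyone who wants to check the glibc side outside of Postgres, here is
a rough sketch of a micro-benchmark. It is only a proxy: Python's
locale.strcoll() goes through the C library's locale-aware comparison
rather than calling strcoll_l() directly, and the helper names
(try_setlocale, time_sort, make_strings), the string sizes, and the
en_US.UTF-8 locale name are all assumptions made for illustration, not
anything taken from the original report. Timings will vary by glibc
version, so no expected numbers are given.

```python
import functools
import locale
import time


def try_setlocale(name):
    """Return True if the named LC_COLLATE locale could be activated."""
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return True
    except locale.Error:
        return False


def time_sort(strings, key=None):
    """Sort a copy of strings and return (sorted_list, elapsed_seconds)."""
    start = time.perf_counter()
    result = sorted(strings, key=key)
    return result, time.perf_counter() - start


def make_strings(ch, count=500, length=100):
    """Hypothetical test data: a long run of one character plus a numeric suffix."""
    return [ch * length + format(i, "06d") for i in range(count)]


if __name__ == "__main__":
    # Compare via the C library's collation routine for the active locale.
    collate_key = functools.cmp_to_key(locale.strcoll)
    for name in ("C", "en_US.UTF-8"):  # second name is an assumed, common locale
        if not try_setlocale(name):
            print(f"{name}: locale not installed, skipped")
            continue
        for ch in ("a", "<"):
            _, secs = time_sort(make_strings(ch), key=collate_key)
            print(f"{name} fill={ch!r}: {secs:.4f}s")
```

Running the same script on an old glibc (for example 2.17) and a recent
one (for example 2.36) and comparing the '<' rows against the 'a' rows
should give a rough idea of whether the library itself is the bottleneck.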

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
