Re: Performance degradation in Index searches with special characters

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Shiv Iyer <shiv(at)minervadb(dot)com>
Cc: Andrey Stikheev <andrey(dot)stikheev(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Performance degradation in Index searches with special characters
Date: 2024-10-06 20:24:15
Message-ID: CA+hUKGJMJbsJA=WyP6e8sNdtxtMnERAZKek8yNgb8Lf0E6fi=Q@mail.gmail.com
Lists: pgsql-performance

On Mon, Oct 7, 2024 at 9:02 AM Shiv Iyer <shiv(at)minervadb(dot)com> wrote:
> - As the string length increases, the performance degrades exponentially when using special characters. This is due to the collation’s computational complexity for each additional character comparison.

That's a pretty interesting observation, worthy of a bug report. I
don't know the details offhand, but IIRC the algorithm used should be
basically the same as, or more likely a simpler subset of, what other
collation implementations are using. Don't quote me, but I think it's
supposed to be ISO 14651, which is essentially a subset of the UCA
stuff that ICU is using (it is "aligned with" UCA DUCET), without CLDR
customisations and perhaps various other complications. So it doesn't
sound like it should be fundamentally required to be *more* expensive
than ICU...
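
A bug report along these lines would probably be easier to act on with
numbers attached. The following is only a rough sketch of how the
scaling claim could be checked outside the database, assuming the
collation in question is the C library's and that Python's
locale.strcoll() reaches the same code path. The helper name
time_strcoll, the locale name "en_US.UTF-8", the test characters and
the string lengths are all illustrative choices, not anything taken
from the thread.

```python
import locale
import timeit


def time_strcoll(char, lengths, repeats=200):
    """Return {length: average seconds per strcoll call} for strings of `char`."""
    results = {}
    for n in lengths:
        a = char * n
        b = char * (n - 1) + "a"  # differs from `a` only in the last position
        total = timeit.timeit(lambda: locale.strcoll(a, b), number=repeats)
        results[n] = total / repeats
    return results


def main(locale_name="en_US.UTF-8"):
    # Assumption: this locale is installed; otherwise use the environment's.
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        locale.setlocale(locale.LC_COLLATE, "")
    # Lengths are kept short on purpose in case the growth really is steep.
    lengths = [4, 8, 16, 32]
    for label, char in (("letters", "x"), ("special", "@")):
        for n, secs in time_strcoll(char, lengths).items():
            print(f"{label:8s} len={n:3d} {secs * 1e6:10.2f} us/call")


if __name__ == "__main__":
    main()
```

If the per-call time for the "special" rows grows much faster than for
the "letters" rows as the length doubles, that would support the
observation; if both grow at a similar rate, the slowdown is probably
somewhere else.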
