Re: Optimization for lower(), upper(), casefold() functions.

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Alexander Borisov <lex(dot)borisov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Optimization for lower(), upper(), casefold() functions.
Date:	2025-02-18 22:02:25
Message-ID:	d0dd50662da84f582c34247f4ed43061e8d86d34.camel@j-davis.com
Lists:	pgsql-hackers

On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
> I tried the approach via a range table. The result was worse than
> without the table. With branching in a function, the result is
> better.
>
> Patch v3 — ranges binary search by branches.
> Patch v4 — ranges binary search by table.

Thoughts on v3:

It looks like the top 5 bits of the offset are unused. What if we used
those bits for flags to indicate:

HAS_LOWER
HAS_UPPER
HAS_FOLD
HAS_SPECIAL
HAS_TITLE

That way, we only need to look in the corresponding table if the
codepoint actually has a mapping there other than itself.
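
To make the idea concrete, here is a rough sketch, not code from the patch. It assumes the offset is a 16-bit value whose low 11 bits index into the case tables; the names CASE_INDEX_MASK, case_map(), toy_lower and toy_entry_A, and the bit assignments, are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: low 11 bits are the table index, top 5 bits are flags. */
#define CASE_INDEX_BITS 11
#define CASE_INDEX_MASK ((uint16_t) ((1u << CASE_INDEX_BITS) - 1))

#define HAS_LOWER   ((uint16_t) (1u << 11))
#define HAS_UPPER   ((uint16_t) (1u << 12))
#define HAS_FOLD    ((uint16_t) (1u << 13))
#define HAS_SPECIAL ((uint16_t) (1u << 14))
#define HAS_TITLE   ((uint16_t) (1u << 15))

/* Toy data: index 1 holds the lowercase mapping for U+0041 ('A'). */
static const uint32_t toy_lower[] = {0x0000, 0x0061};
static const uint16_t toy_entry_A = (uint16_t) (1 | HAS_LOWER | HAS_FOLD);

/*
 * Return the mapping of cp from 'table', but only touch the table when
 * 'flag' is set in the entry; otherwise the codepoint maps to itself.
 */
static uint32_t
case_map(uint32_t cp, uint16_t entry, uint16_t flag,
         const uint32_t *table, size_t table_len)
{
    size_t idx;

    if ((entry & flag) == 0)
        return cp;              /* no entry other than the codepoint itself */

    idx = entry & CASE_INDEX_MASK;
    assert(idx < table_len);
    return table[idx];
}
```

With the toy entry above, asking for HAS_LOWER on U+0041 reads toy_lower and gives U+0061, while asking for HAS_UPPER returns U+0041 without reading any table.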

It doesn't leave a lot of room if the tables get larger, but if we are
worried about that, we could eliminate HAS_TITLE, because I don't think
the performance for INITCAP() is as important as LOWER/UPPER/CASEFOLD.
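
As a back-of-the-envelope check on the room left for the index, the following sketch computes how many table entries remain addressable once some bits are reserved for flags. The 16-bit total used in the example is an assumption; the width of the offset field is not stated here, and index_capacity() is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Number of distinct table indexes left when flag_bits out of total_bits
 * are reserved for flags. Assumes flag_bits <= total_bits < 32.
 */
static uint32_t
index_capacity(unsigned total_bits, unsigned flag_bits)
{
    return (uint32_t) 1 << (total_bits - flag_bits);
}
```

Under that assumption, five flags leave index_capacity(16, 5) = 2048 addressable entries, and dropping HAS_TITLE raises that to index_capacity(16, 4) = 4096.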

Regards,
Jeff Davis
