From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Alexander Borisov <lex(dot)borisov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Optimization for lower(), upper(), casefold() functions. |
Date: | 2025-02-18 22:02:25 |
Message-ID: | d0dd50662da84f582c34247f4ed43061e8d86d34.camel@j-davis.com |
Lists: | pgsql-hackers |
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
> I tried the approach via a range table. The result was worse than
> without the table. With branching in a function, the result is
> better.
>
> Patch v3 — ranges binary search by branches.
> Patch v4 — ranges binary search by table.
Thoughts on v3:

It looks like the top 5 bits of the offset are unused. What if we used
those bits for flags to indicate:

  HAS_LOWER
  HAS_UPPER
  HAS_FOLD
  HAS_SPECIAL
  HAS_TITLE

That way, we only need to look in the corresponding table if it
actually has an entry other than the codepoint itself.

It doesn't leave a lot of room if the tables get larger, but if we are
worried about that, we could eliminate HAS_TITLE, because I don't think
the performance for INITCAP() is as important as LOWER/UPPER/CASEFOLD.
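
To make the flag idea concrete, here is a rough sketch of how it might look. It is not taken from the patch: it assumes a 16-bit index entry whose low 11 bits are the offset, and the names CASE_OFFSET_MASK, pack_entry(), case_lower(), case_upper() and the toy tables are hypothetical, used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16-bit entry, low 11 bits = offset, top 5 bits = flags. */
#define CASE_OFFSET_BITS	11
#define CASE_OFFSET_MASK	((uint16_t) ((1u << CASE_OFFSET_BITS) - 1))

#define HAS_LOWER	((uint16_t) (1u << 11))
#define HAS_UPPER	((uint16_t) (1u << 12))
#define HAS_FOLD	((uint16_t) (1u << 13))
#define HAS_SPECIAL	((uint16_t) (1u << 14))
#define HAS_TITLE	((uint16_t) (1u << 15))

/* Toy mapping tables (hypothetical); slot 1 holds the 'A'/'a' pair. */
static const uint32_t toy_lower[] = {0, 0x0061};
static const uint32_t toy_upper[] = {0, 0x0041};

/* Combine an offset and flags; the offset must fit in the low 11 bits. */
static uint16_t
pack_entry(uint16_t offset, uint16_t flags)
{
	assert(offset <= CASE_OFFSET_MASK);
	assert((flags & CASE_OFFSET_MASK) == 0);
	return (uint16_t) (offset | flags);
}

/* Consult the lower table only when the entry says a mapping exists. */
static uint32_t
case_lower(uint32_t cp, uint16_t entry)
{
	if ((entry & HAS_LOWER) == 0)
		return cp;				/* maps to itself, no table access */
	return toy_lower[entry & CASE_OFFSET_MASK];
}

/* Same idea for the upper table. */
static uint32_t
case_upper(uint32_t cp, uint16_t entry)
{
	if ((entry & HAS_UPPER) == 0)
		return cp;
	return toy_upper[entry & CASE_OFFSET_MASK];
}
```

Under that assumed layout the offset can address at most 2048 slots, which is the room concern above; dropping HAS_TITLE would free one more bit for the offset.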
Regards,
Jeff Davis