Re: Optimization for lower(), upper(), casefold() functions.

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Alexander Borisov <lex(dot)borisov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimization for lower(), upper(), casefold() functions.
Date: 2025-03-12 04:05:13
Message-ID: 4f3772355e038acd3cfe0be43ae2c1aacae1794d.camel@j-davis.com
Lists: pgsql-hackers

On Sun, 2025-03-02 at 23:33 +0300, Alexander Borisov wrote:
> Did you have a time for review this?
>
> I'd like to continue improving Unicode in Postgres, as I previously
> wrote, next in my plans are Normalization forms, and more.
> But now I am blocked by this patch.

Hi,

I have refactored unicode_case.c a bit (v3j-0001) and rebased your v3
work on top of that (v3j-0002).

The refactoring is so that the optimizations do not need to modify
convert_case, which is already complex and I'd like to avoid adding
more to that function. Instead, I created a casemap() function, which
maps a single character, and convert_case() calls that.
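
To illustrate the shape of that split, here is a minimal sketch. It is not the code from the attached patches: the names sketch_casemap(), sketch_convert_case() and CaseKind are hypothetical, and the per-character mapping handles only ASCII where the real code would consult the generated Unicode tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum for this sketch; not the type used in unicode_case.c. */
typedef enum { CASE_LOWER, CASE_UPPER } CaseKind;

/*
 * Map a single code point. Assumption: only ASCII letters are mapped here;
 * a real implementation would look up the Unicode case tables instead.
 */
static inline uint32_t
sketch_casemap(uint32_t cp, CaseKind kind)
{
    if (kind == CASE_LOWER && cp >= 'A' && cp <= 'Z')
        return cp + ('a' - 'A');
    if (kind == CASE_UPPER && cp >= 'a' && cp <= 'z')
        return cp - ('a' - 'A');
    return cp;
}

/*
 * Walk the source string and delegate each character to sketch_casemap().
 * Treats every byte as one character (a simplification; no UTF-8 decoding).
 * Returns the length needed for the full result, excluding the terminator,
 * and always NUL-terminates dst when dstsize > 0.
 */
static size_t
sketch_convert_case(char *dst, size_t dstsize, const char *src, CaseKind kind)
{
    size_t n = 0;

    for (; src[n] != '\0'; n++)
    {
        if (n + 1 < dstsize)
            dst[n] = (char) sketch_casemap((unsigned char) src[n], kind);
    }
    if (dstsize > 0)
        dst[n < dstsize ? n : dstsize - 1] = '\0';
    return n;
}
```

With this layout, an optimization to the per-character lookup only touches the small mapping function, and the loop in the caller stays as it is.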

I didn't test the refactoring for performance, but it looks as
optimizable as what was there before.

A couple questions:

* Is there a reason the fast-path for codepoints < 0x80 is in
unicode_case.c rather than unicode_case_func.h?

* Is there a reason you defined case_index() as static rather than
static inline?

* Is there a reason to have a new file unicode_case_func.h rather than
just add it to unicode_case_table.h?
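
For readers following along, the first two questions concern something like the sketch below: a static inline helper, suitable for a header, that short-circuits code points below 0x80 before falling back to a table lookup. The names sketch_table_lower() and sketch_lower_fast() are hypothetical, and the "table" is a one-entry stand-in, not the generated data from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the generated table lookup. Only one non-ASCII
 * mapping is included for illustration; everything else maps to itself.
 */
static inline uint32_t
sketch_table_lower(uint32_t cp)
{
    if (cp == 0x00C9)           /* U+00C9 LATIN CAPITAL LETTER E WITH ACUTE */
        return 0x00E9;          /* U+00E9 LATIN SMALL LETTER E WITH ACUTE */
    return cp;
}

/*
 * Lowercase one code point, with an ASCII fast path. Declared static inline
 * so that each caller including the header can inline it.
 */
static inline uint32_t
sketch_lower_fast(uint32_t cp)
{
    if (cp < 0x80)
    {
        /* ASCII: no table access needed. */
        if (cp >= 'A' && cp <= 'Z')
            return cp + 0x20;
        return cp;
    }
    return sketch_table_lower(cp);
}
```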

I'm looking at a few more details, but this is a low-risk change
because there are exhaustive tests, so I intend to commit something
like this soon.

Regards,
Jeff Davis

Attachments:
  v3j-0001-Refactor-convert_case-to-prepare-for-optimizatio.patch  (text/x-patch, 6.0 KB)
  v3j-0002-Optimization-for-lower-upper-casefold-functions.patch   (text/x-patch, 703.9 KB)