Re: Optimization for lower(), upper(), casefold() functions.

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Alexander Borisov <lex(dot)borisov(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimization for lower(), upper(), casefold() functions.
Date: 2025-03-14 11:16:42
Message-ID: 90da84b8-8a89-4ae3-a970-120edc5435c8@iki.fi
Lists: pgsql-hackers

On 14/03/2025 05:43, Jeff Davis wrote:
> On Wed, 2025-03-12 at 23:39 +0300, Alexander Borisov wrote:
>> v5 attached.
>
> Attached v6j.
>
> * marked arrays as "static const" rather than just "static"
> * ran pgindent
> * changed data types where appropriate (uint32->pg_wchar)
> * modified perl code so that it produces code that's already pgindented
> * cleanup of perl code, removing unnecessary subroutines and variables
> * added a few comments
> * ran pgperltidy
>
> Some of the perl code working with ranges still needs further cleanup
> and explanation, though.
>
> Also, I ran some of my own simple tests (mostly ASCII) and it showed
> over 10% speedup. That combined with the smaller table sizes makes this
> well worth it.

Looks good overall.

> static const pg_wchar case_map_lower[1677] =
> {
> 0x000000, /* U+000000 */
> 0x000000, /* U+000000 */
> 0x000001, /* U+000001 */
> 0x000002, /* U+000002 */

The duplicated 0x000000 looks wrong. I understand that the 0'th entry is
reserved, and the actual codepoints start at index 1, but the /*
U+000000 */ comment on the 0'th entry is misleading.
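
To illustrate the reserved-entry convention, here is a minimal sketch, not the code from the patch. It assumes that index 0 means "no mapping" and that a separate lookup yields the index; the names demo_map_lower, demo_index and demo_lower are hypothetical, and pg_wchar is stood in for by a plain typedef.

```c
#include <stdint.h>

typedef uint32_t pg_wchar;      /* stand-in for PostgreSQL's pg_wchar */

/* Hypothetical miniature table: entry 0 is reserved, not a codepoint. */
static const pg_wchar demo_map_lower[] =
{
    0x000000,                   /* reserved; not used for any character */
    0x000061,                   /* U+000041 maps to U+000061 */
    0x000062,                   /* U+000042 maps to U+000062 */
};

/*
 * Hypothetical index lookup: returns a table index >= 1 for codepoints
 * that have a mapping, or 0 (the reserved entry) when there is none.
 */
static unsigned
demo_index(pg_wchar cp)
{
    if (cp >= 0x41 && cp <= 0x42)
        return (unsigned) (cp - 0x41 + 1);
    return 0;
}

/* Map a codepoint to lower case; unmapped codepoints are returned as-is. */
static pg_wchar
demo_lower(pg_wchar cp)
{
    unsigned    idx = demo_index(cp);

    return idx == 0 ? cp : demo_map_lower[idx];
}
```

With this layout the 0th entry is never read as a mapping result, so a comment such as "reserved" describes it more accurately than a codepoint label would.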

> static const uint8 case_map_special[1677] =
> {
> 0x000000, /* U+000000 */
> 0x000000, /* U+000000 */
> ...

0x000000 implies a 24-bit integer, but these are uint8s. Let's use
plain base-10 decimals here rather than hex, like in 'case_map'.
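
As a rough sketch of what the decimal form could look like (the array name and values below are made up for illustration, not taken from the patch, and uint8 is stood in for by a plain typedef):

```c
#include <stdint.h>

typedef uint8_t uint8;          /* stand-in for PostgreSQL's uint8 */

/*
 * Hypothetical miniature table of small indexes.  Decimal literals match
 * the one-byte element type; six hex digits would suggest a wider value.
 */
static const uint8 demo_special_index[] =
{
    0,                          /* reserved; not used for any character */
    0,                          /* U+000000 */
    1,                          /* illustrative non-zero index */
};
```

Both spellings compile to the same data; the difference is only in what the source suggests to a reader about the element width.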

Attached are fixes for those and some other minor things.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment                                                          Content-Type  Size
0001-minor-fixes-in-the-perl-script.patch                           text/x-patch  1.8 KB
0002-use-decimal-for-case_map_special-indexes.patch                 text/x-patch  94.1 KB
0003-use-better-comment-for-the-0th-reserved-entry-in-the.patch     text/x-patch  4.7 KB