Re: tiny step toward threading: reduce dependence on setlocale()

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: tiny step toward threading: reduce dependence on setlocale()
Date: 2024-08-08 21:09:53
Message-ID: b961acfc-341c-4693-b944-2dddc9dcfdc9@eisentraut.org
Lists: pgsql-hackers

On 07.08.24 22:44, Peter Eisentraut wrote:
> (Now that I look at it, pg_tolower() has some short-circuiting for ASCII
> letters, so it would not handle Turkish-i correctly if that had been the
> global locale.  By removing the use of pg_tolower(), we fix that issue
> in passing.)

It occurred to me that this issue also surfaces in a more prominent
place.  These arguably wrong pg_tolower() and pg_toupper() calls were
also used by the normal SQL lower() and upper() functions before commit
e9931bfb751 if you used a single-byte encoding.

For example, in PG17, multi-byte encoding:

initdb --locale=tr_TR.utf8

select upper('hij'); --> HİJ

PG17, single-byte encoding:

initdb --locale=tr_TR # uses LATIN5

select upper('hij'); --> HIJ

With current master, after commit e9931bfb751, you get the first result
in both cases.
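
To make the difference between the two results concrete, here is a small
Python model of the two behaviors.  It is only a sketch, not PostgreSQL
code: the function names are made up for illustration, and the
Turkish-aware mapping is hard-coded rather than taken from a real tr_TR
locale.

```python
# Illustrative model only; this is not PostgreSQL source code.
# All function names here are hypothetical.
DOTTED_CAPITAL_I = "\u0130"  # İ, byte 0xDD in LATIN5 (ISO 8859-9)
DOTLESS_SMALL_I = "\u0131"   # ı, byte 0xFD in LATIN5


def turkish_upper_char(ch: str) -> str:
    """Uppercase one character using the Turkish i/İ and ı/I pairing."""
    if ch == "i":
        return DOTTED_CAPITAL_I
    if ch == DOTLESS_SMALL_I:
        return "I"
    return ch.upper()


def turkish_upper(text: str) -> str:
    """Model of a locale-aware upper() under a Turkish locale."""
    return "".join(turkish_upper_char(ch) for ch in text)


def ascii_first_upper(text: str) -> str:
    """Model of the short-circuit: a-z map straight to A-Z, and only
    other characters reach the locale-aware mapping."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr(ord(ch) - ord("a") + ord("A")))
        else:
            out.append(turkish_upper_char(ch))
    return "".join(out)


print(ascii_first_upper("hij"))  # -> HIJ
print(turkish_upper("hij"))      # -> HİJ
```

In this model the short-circuit never lets 'i' reach the Turkish mapping,
which matches the HIJ result shown above for the single-byte case.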

So this could break indexes across pg_upgrade in such configurations.
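
One way to picture the risk is the following Python sketch.  It assumes the
affected index is an expression index on upper(col), which is my
assumption for illustration; a plain dict stands in for the index, and
both upper functions are simplified, hypothetical models rather than the
real implementations.

```python
# Illustrative model only: a dict stands in for an expression index
# on upper(col).  All names are hypothetical.

def old_upper(text: str) -> str:
    """Model of the old behavior: a-z map straight to A-Z."""
    return "".join(
        chr(ord(c) - ord("a") + ord("A")) if "a" <= c <= "z" else c
        for c in text
    )


def new_upper(text: str) -> str:
    """Model of the new behavior under a Turkish locale: i -> İ (U+0130)."""
    return "".join("\u0130" if c == "i" else old_upper(c) for c in text)


def lookup(index: dict, upper_fn, value: str):
    """Probe the index with the key that upper_fn computes for value."""
    return index.get(upper_fn(value))


rows = ["hij", "abc"]

# Index entries are computed once, with the old function, before the upgrade.
index = {old_upper(v): v for v in rows}

found_before = lookup(index, old_upper, "hij")  # key "HIJ" matches
found_after = lookup(index, new_upper, "hij")   # key "HİJ" is not in the index
unaffected = lookup(index, new_upper, "abc")    # no 'i', so the key is unchanged
```

Here found_after is None even though the row is still there, because the
stored key and the newly computed key no longer agree.  Values without an
'i' are unaffected in this model.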
