| From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
|---|---|
| To: | Jeff Davis <pgsql(at)j-davis(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: tiny step toward threading: reduce dependence on setlocale() |
| Date: | 2024-08-08 21:09:53 |
| Message-ID: | b961acfc-341c-4693-b944-2dddc9dcfdc9@eisentraut.org |
| Lists: | pgsql-hackers |
On 07.08.24 22:44, Peter Eisentraut wrote:
> (Now that I look at it, pg_tolower() has some short-circuiting for ASCII
> letters, so it would not handle Turkish-i correctly if that had been the
> global locale. By removing the use of pg_tolower(), we fix that issue
> in passing.)
It occurred to me that this issue also surfaces in a more prominent
place. These arguably-wrong pg_tolower() and pg_toupper() calls were
also used by the normal SQL lower() and upper() functions before commit
e9931bfb751 if you used a single-byte encoding.
For example, in PG17, multi-byte encoding:
    initdb --locale=tr_TR.utf8
    select upper('hij'); --> HİJ
PG17, single-byte encoding:
    initdb --locale=tr_TR  # uses LATIN5
    select upper('hij'); --> HIJ
With current master, after commit e9931bfb751, you get the first result
in both cases.
So this could break indexes across pg_upgrade in such configurations.
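For illustration, here is a simplified sketch of the kind of ASCII
short-circuit being discussed. It is not the actual PostgreSQL source;
the function names sketch_toupper_shortcircuit() and
sketch_toupper_locale() are made up for this example, and the high-bit
check is an assumption about how such a helper is typically written.

```c
#include <ctype.h>

/*
 * Hypothetical helper: upper-cases ASCII letters directly and only
 * consults the C library (and thus the global locale) for bytes with
 * the high bit set.
 */
static unsigned char
sketch_toupper_shortcircuit(unsigned char ch)
{
	if (ch >= 'a' && ch <= 'z')
		return (unsigned char) (ch - 'a' + 'A');	/* locale never consulted */
	if ((ch & 0x80) && islower(ch))
		return (unsigned char) toupper(ch);
	return ch;
}

/*
 * Hypothetical counterpart: always defers to toupper(), so the result
 * for 'i' depends on the locale selected with setlocale().
 */
static unsigned char
sketch_toupper_locale(unsigned char ch)
{
	return (unsigned char) toupper(ch);
}
```

With the first variant, 'i' always maps to 'I', which matches the
single-byte PG17 result above. With the second, a Turkish single-byte
locale would be expected to map 'i' to the dotted capital İ (byte 0xDD
in LATIN5), assuming that locale is installed and has been selected.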