Re: Thread-safe nl_langinfo() and localeconv()

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Thread-safe nl_langinfo() and localeconv()
Date: 2024-08-16 00:48:07
Message-ID: CA+hUKGLqLxaNXN8mSqetYpD0QHWimHWZbi6KOfP=hfRvBfA9iw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 15, 2024 at 11:11 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> There's a similar function in initdb, check_locale_name. I wonder if
> that could reuse the same code.

Thanks, between this comment and some observations from Peter E and
Tom, I think I have a better plan now. I think they should *not*
match, and a comment saying so should be deleted. In the backend, we
should do neither ""-expansion (ie getenv("LC_...") whether direct or
indirect) nor canonicalisation (of Windows' deprecated pre-BCP 47
locale names), making this v4 patch extremely simple.

1. CREATE DATABASE doesn't really need to accept LOCALE = ''. What
is the point? It's not documented or desirable behavior AFAIK. If
you like defaults you can just not provide a locale at all and get the
template database's (which often comes from initdb, which often uses
the server environment). That behavior was already inconsistent with
CREATE COLLATION. So I think we should just reject "" in the backend
check_locale().

2. A similar argument applies to Windows canonicalisation. CREATE
COLLATION isn't doing it. CREATE DATABASE is, but again, what is the
point? See previous.

(I also think that initdb should use a different mechanism for finding
the native locale on Windows, but I already have a CF #3772 for that,
ie migration plan for BCP 47 and native UTF-8 on Windows, but I don't
want *this* thread to get blocked by our absence of Windows
reviewers/testers, so let's not tangle that up with this thread-safety
expedition.)

To show a concrete example of commands no longer accepted with this
version, because they call check_locale():

postgres=# set lc_monetary = '';
ERROR: invalid value for parameter "lc_monetary": ""

postgres=# create database db2 locale = '';
ERROR: invalid LC_COLLATE locale name: ""
HINT: If the locale name is specific to ICU, use ICU_LOCALE.

Does anyone see a problem with that?

I do see a complication for CREATE COLLATION, though. It doesn't call
check_locale(), is not changed in this patch, and still accepts ''.
Reasoning: There may be systems with '' in their pg_collation catalog
in the wild, since we never canonicalised with setlocale(), so it
might create some kind of unforeseen dump/restore/upgrade hazard if we
just ban '' outright, I just don't know what yet.

There is no immediate problem, ie there is no setlocale() to excise,
for *this* project. Longer term, we can't actually continue to allow
'' in COLLATION objects, though: that tells newlocale() to call
getenv(), which may be technically OK in a multi-threaded program
(that never calls setenv()), but is hardly desirable. But it will
also give the wrong result, if we pursue the plan that Jeff and I
discussed: we'll stop doing setenv("LC_COLLATE", datacollate) and
setenv("LC_CTYPE", datctype) in postinit.c (see pg_perm_setlocale()
calls). So I think any pg_collation catalog entries holding '' need
to be translated to datcollate/datctype... somewhere. I just don't
know where yet and don't want to tackle that in the same patch.

Attachment Content-Type Size
v4-0001-Provide-thread-safe-pg_localeconv_r.patch text/x-patch 22.5 KB
v4-0002-Use-thread-safe-strftime_l-instead-of-strftime.patch text/x-patch 7.9 KB
v4-0003-Remove-setlocale-from-check_locale.patch text/x-patch 10.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2024-08-16 00:53:37 Re: Statistics Import and Export
Previous Message Alvaro Herrera 2024-08-16 00:40:42 Re: define PG_REPLSLOT_DIR