From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Thread-safe nl_langinfo() and localeconv() |
Date: | 2024-08-16 00:48:07 |
Message-ID: | CA+hUKGLqLxaNXN8mSqetYpD0QHWimHWZbi6KOfP=hfRvBfA9iw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Aug 15, 2024 at 11:11 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> There's a similar function in initdb, check_locale_name. I wonder if
> that could reuse the same code.
Thanks, between this comment and some observations from Peter E and
Tom, I think I have a better plan now. I think they should *not*
match, and a comment saying so should be deleted. In the backend, we
should do neither ""-expansion (ie getenv("LC_...") whether direct or
indirect) nor canonicalisation (of Windows' deprecated pre-BCP 47
locale names), making this v4 patch extremely simple.
1. CREATE DATABASE doesn't really need to accept LOCALE = ''. What
is the point? It's not documented or desirable behavior AFAIK. If
you like defaults you can just not provide a locale at all and get the
template database's (which often comes from initdb, which often uses
the server environment). That behavior was already inconsistent with
CREATE COLLATION. So I think we should just reject "" in the backend
check_locale().
2. A similar argument applies to Windows canonicalisation. CREATE
COLLATION isn't doing it. CREATE DATABASE is, but again, what is the
point? See previous.
(I also think that initdb should use a different mechanism for finding
the native locale on Windows, but I already have a CF #3772 for that,
ie migration plan for BCP 47 and native UTF-8 on Windows, but I don't
want *this* thread to get blocked by our absence of Windows
reviewers/testers, so let's not tangle that up with this thread-safety
expedition.)
To show a concrete example of commands no longer accepted with this
version, because they call check_locale():
postgres=# set lc_monetary = '';
ERROR: invalid value for parameter "lc_monetary": ""
postgres=# create database db2 locale = '';
ERROR: invalid LC_COLLATE locale name: ""
HINT: If the locale name is specific to ICU, use ICU_LOCALE.
Does anyone see a problem with that?
I do see a complication for CREATE COLLATION, though. It doesn't call
check_locale(), is not changed in this patch, and still accepts ''.
Reasoning: There may be systems with '' in their pg_collation catalog
in the wild, since we never canonicalised with setlocale(), so it
might create some kind of unforeseen dump/restore/upgrade hazard if we
just ban '' outright, I just don't know what yet.
There is no immediate problem, ie there is no setlocale() to excise,
for *this* project. Longer term, we can't actually continue to allow
'' in COLLATION objects, though: that tells newlocale() to call
getenv(), which may be technically OK in a multi-threaded program
(that never calls setenv()), but is hardly desirable. But it will
also give the wrong result, if we pursue the plan that Jeff and I
discussed: we'll stop doing setenv("LC_COLLATE", datacollate) and
setenv("LC_CTYPE", datctype) in postinit.c (see pg_perm_setlocale()
calls). So I think any pg_collation catalog entries holding '' need
to be translated to datcollate/datctype... somewhere. I just don't
know where yet and don't want to tackle that in the same patch.
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Provide-thread-safe-pg_localeconv_r.patch | text/x-patch | 22.5 KB |
v4-0002-Use-thread-safe-strftime_l-instead-of-strftime.patch | text/x-patch | 7.9 KB |
v4-0003-Remove-setlocale-from-check_locale.patch | text/x-patch | 10.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Corey Huinker | 2024-08-16 00:53:37 | Re: Statistics Import and Export |
Previous Message | Alvaro Herrera | 2024-08-16 00:40:42 | Re: define PG_REPLSLOT_DIR |