Re: Remaining dependency on setlocale()

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remaining dependency on setlocale()
Date: 2024-08-12 04:53:17
Message-ID: CA+hUKGKf16vjpe4Q28tHQx_hN=OupPUoUNpDsijByMBz+_89dg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 12, 2024 at 3:24 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> 1. The nl_langinfo() call in pg_get_encoding_from_locale(), can
> probably be changed to nl_langinfo_l() (it is everywhere we currently
> care about except Windows, which has a different already-thread-safe
> alternative ...

... though if we wanted to replace all use of localeconv and struct
lconv with nl_langinfo_l() calls, it's not totally obvious how to do
that on Windows. Its closest thing is GetLocaleInfoEx(), but that has
complications: it takes wchar_t locale names, which we don't even have
and can't access when we only have a locale_t, and it must look them
up in some data structure every time, and it copies data out to the
caller as wchar_t so now you have two conversion problems and a
storage problem. If I understand correctly, the whole point of
nl_langinfo_l(item, loc) is that it is supposed to be fast, it's
really just an array lookup, and item is just an index, and the result
is supposed to be stable as long as loc hasn't been freed (and the
thread hasn't exited). So you can use it without putting your own
caching in front of it.

One idea I came up with, which I haven't tried and which might turn
out to be terrible, is that we could change our definition of
locale_t on Windows. Currently it's a typedef to
Windows' _locale_t, and we use it with a bunch of _XXX functions that
we access by macro to remove the underscore. Instead, we could make
locale_t a pointer to a struct of our own design in WIN32 builds,
holding the native _locale_t and also an array full of all the values
that nl_langinfo_l() can return. We'd provide the standard enums,
indexes into that array, in a fake POSIX-oid header <langinfo.h>.
Then nl_langinfo_l(item, loc) could be implemented as
loc->private_langinfo[item], and strcoll_l(.., loc) could be a static
inline function that does _strcoll_l(...,
loc->private_windows_locale_t). These structs would be allocated and
freed with standard-looking newlocale() and freelocale(), so we could
finally stop using #ifdef WIN32-wrapped _create_locale() directly.
Then everything would look more POSIX-y, nl_langinfo_l() could be used
directly wherever we need fast access to that info, and we could, I
think, banish the awkward localeconv, right? I don't know if this all
makes total sense and haven't tried it, just spitballing here...
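
To make the shape of that idea concrete, here is a rough, untested-on-Windows sketch in C. Everything prefixed with sketch_ or SKETCH_ is a hypothetical name invented for illustration. The field names private_langinfo and private_windows_locale_t follow the wording above. The constructor takes the langinfo strings as plain arguments because how they would really be obtained on Windows (and a real newlocale()-style signature) is exactly the open question, so that part is an assumption and not a proposal.

```c
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <locale.h>
#endif

/*
 * Hypothetical item indexes. A real fake <langinfo.h> would use the POSIX
 * names (CODESET, RADIXCHAR, THOUSEP, ...) instead.
 */
typedef enum
{
    SKETCH_CODESET,
    SKETCH_RADIXCHAR,
    SKETCH_THOUSEP,
    SKETCH_LANGINFO_COUNT
} sketch_nl_item;

/* Our own locale object: the native handle plus precomputed strings. */
typedef struct sketch_locale
{
    void       *private_windows_locale_t;  /* stand-in for Windows' _locale_t */
    char       *private_langinfo[SKETCH_LANGINFO_COUNT];
} sketch_locale;

typedef sketch_locale *sketch_locale_t;

static char *
sketch_strdup(const char *s)
{
    size_t      len = strlen(s) + 1;
    char       *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Counterpart of freelocale(): releases the strings and the struct. */
static void
sketch_freelocale(sketch_locale_t loc)
{
    if (loc == NULL)
        return;
    for (int i = 0; i < SKETCH_LANGINFO_COUNT; i++)
        free(loc->private_langinfo[i]);
    /* A Windows build would also release the native handle here. */
    free(loc);
}

/*
 * Counterpart of newlocale(). The strings are passed in by the caller only
 * to keep the sketch portable; a Windows build would create the native
 * handle and fill the array once, at creation time.
 */
static sketch_locale_t
sketch_newlocale(const char *codeset, const char *radixchar,
                 const char *thousep)
{
    sketch_locale_t loc = calloc(1, sizeof(*loc));

    if (loc == NULL)
        return NULL;
    loc->private_langinfo[SKETCH_CODESET] = sketch_strdup(codeset);
    loc->private_langinfo[SKETCH_RADIXCHAR] = sketch_strdup(radixchar);
    loc->private_langinfo[SKETCH_THOUSEP] = sketch_strdup(thousep);
    for (int i = 0; i < SKETCH_LANGINFO_COUNT; i++)
    {
        if (loc->private_langinfo[i] == NULL)
        {
            sketch_freelocale(loc);
            return NULL;
        }
    }
    return loc;
}

/* The fast path: just an array lookup, stable until the locale is freed. */
static inline const char *
sketch_nl_langinfo_l(sketch_nl_item item, sketch_locale_t loc)
{
    return loc->private_langinfo[item];
}

#ifdef _WIN32
/* Wrapper that forwards to the native function using the stored handle. */
static inline int
sketch_strcoll_l(const char *a, const char *b, sketch_locale_t loc)
{
    return _strcoll_l(a, b, (_locale_t) loc->private_windows_locale_t);
}
#endif
```

With something like this, a caller would create the object once, call sketch_nl_langinfo_l(SKETCH_RADIXCHAR, loc) wherever it needs the value without any caching of its own, and release everything with sketch_freelocale(loc).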
