Re: Remaining dependency on setlocale()

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remaining dependency on setlocale()
Date: 2024-08-07 21:40:56
Message-ID: CA+hUKGJtpMwBjrcZMMRCduj7ER+doehuj2i3dN5mC69=DdHkAQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 8, 2024 at 6:18 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Aug 7, 2024 at 1:29 PM Joe Conway <mail(at)joeconway(dot)com> wrote:
> > FWIW I see all of these in glibc:
> >
> > isalnum_l, isalpha_l, isascii_l, isblank_l, iscntrl_l, isdigit_l,
> > isgraph_l, islower_l, isprint_l, ispunct_l, isspace_l, isupper_l,
> > isxdigit_l
>
> On my MacBook (Ventura, 13.6.7), I see all of these except for isascii_l.

Those (except isascii_l) are from POSIX 2008[1]. They were absorbed
from "Extended API Set Part 4"[2], along with locale_t (that's why
there is a header <xlocale.h> on a couple of systems even though after
absorption they are supposed to be in <locale.h>). We already
decided that all computers have that stuff (commit 8d9a9f03), but the
reality is a little messier than that... NetBSD hasn't implemented
uselocale() yet[3], though it has a good set of _l functions. As
discussed in [3], ECPG code is therefore currently broken in
multithreaded clients because it's falling back to a setlocale() path,
and I think Windows+MinGW must be too (it lacks
HAVE__CONFIGTHREADLOCALE), but those both have a good set of _l
functions. In that thread I tried to figure out how to use _l
functions to fix that problem, but ...
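
As an illustration of the _l functions listed above, here is a minimal sketch of classifying characters against an explicit locale_t instead of the process-wide setlocale() state. The helper name count_spaces_c() is hypothetical, and the <xlocale.h> include reflects an assumption that macOS still declares these functions there; the sketch otherwise relies only on POSIX 2008 newlocale(), isspace_l() and freelocale().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#ifdef __APPLE__
#include <xlocale.h>			/* assumption: _l functions live here on macOS */
#endif

/*
 * count_spaces_c (hypothetical helper): count whitespace characters in s
 * according to the "C" locale, without reading or changing the global
 * locale.  Returns -1 if the locale object cannot be created.
 */
static long
count_spaces_c(const char *s)
{
	locale_t	c_loc;
	long		n = 0;

	assert(s != NULL);

	/* Build a private locale object; the global locale is untouched. */
	c_loc = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
	if (c_loc == (locale_t) 0)
		return -1;

	for (; *s != '\0'; s++)
	{
		/* Cast to unsigned char, as required for the is*() family. */
		if (isspace_l((unsigned char) *s, c_loc))
			n++;
	}

	freelocale(c_loc);
	return n;
}
```

Creating and freeing the locale on every call keeps the sketch short; real code would more likely create the locale_t once and reuse it.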

The issue there is that we have our own snprintf.c, which implicitly
requires LC_NUMERIC to be "C". It is documented as always printing
floats a certain way regardless of locale, and that's what its callers
want in frontend and backend code, but in reality it punts to the
system snprintf for floats on the assumption that LC_NUMERIC is "C".
We configure that early in backend startup, but frontend code has to
do it for itself! So we could use snprintf_l or strtod_l instead, but POSIX
hasn't got those yet. Or we could use our own Ryu code (fairly
specific), but integrating Ryu into our snprintf.c (and correctly
implementing all the %... stuff?) sounds like quite a hard,
devil-in-the-details kind of an undertaking to me. Or maybe it's
easy, I dunno. As for the _l functions, you could probably get away
with "every computer has either uselocale() or snprintf_l() (or
strtod_l()?)" and have two code paths in our snprintf.c. But then we'd
also need a place to track a locale_t for a long-lived newlocale("C"),
which was too messy in my latest attempt...
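
The two code paths described above could look roughly like the following sketch. It is not PostgreSQL's snprintf.c: format_double_c() and the HAVE_SNPRINTF_L macro are hypothetical names, the fixed "%.2f" format is chosen only for brevity, and the snprintf_l() branch assumes the BSD/macOS argument order (locale_t before the format string). The caller is assumed to own a long-lived locale_t obtained from newlocale(LC_ALL_MASK, "C", (locale_t) 0).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>			/* assumption: snprintf_l/uselocale live here */
#endif

/*
 * format_double_c (hypothetical helper): print value with two decimals and
 * a '.' decimal point, whatever LC_NUMERIC the calling thread or process
 * currently has.  c_loc must be a "C" locale created by the caller.
 * Returns the snprintf() result.
 */
static int
format_double_c(char *buf, size_t len, locale_t c_loc, double value)
{
	int			r;

	assert(buf != NULL && c_loc != (locale_t) 0);

#ifdef HAVE_SNPRINTF_L			/* hypothetical configure-style macro */
	/* Path 1: non-POSIX snprintf_l(), BSD/macOS argument order assumed. */
	r = snprintf_l(buf, len, c_loc, "%.2f", value);
#else
	{
		/* Path 2: switch this thread's locale only, then restore it. */
		locale_t	save = uselocale(c_loc);

		r = snprintf(buf, len, "%.2f", value);
		uselocale(save);
	}
#endif
	return r;
}
```

Passing the locale_t in as a parameter sidesteps the question of where a long-lived "C" locale should be stored, which is the part described above as messy.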

[1] https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/functions/isspace.html
[2] https://pubs.opengroup.org/onlinepubs/9699939499/toc.pdf
[3] https://www.postgresql.org/message-id/flat/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
