Re: Remaining dependency on setlocale()

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remaining dependency on setlocale()
Date: 2024-08-12 03:24:01
Message-ID: CA+hUKGLZ8A_j+UQidb2yKeUQm_FOYz3HgZRES21YOOZBT88pZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 8, 2024 at 5:16 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Wed, 2024-08-07 at 19:07 +1200, Thomas Munro wrote:
> > How far can we get by using more _l() functions?
>
> There are a ton of calls to, for example, isspace(), used mostly for
> parsing.
>
> I wouldn't expect a lot of differences in behavior from locale to
> locale, like might be the case with iswspace(), but behavior can be
> different at least in theory.
>
> So I guess we're stuck with setlocale()/uselocale() for a while, unless
> we're able to move most of those call sites over to an ascii-only
> variant.

Here are two more cases that I don't think I've seen discussed.

1. The nl_langinfo() call in pg_get_encoding_from_locale(), can
probably be changed to nl_langinfo_l() (it is everywhere we currently
care about except Windows, which has a different already-thread-safe
alternative; AIX seems to lack the _l version, but someone writing a
patch to re-add support for that OS could supply the configure goo for
a uselocale() safe/restore implementation). One problem is that it
has callers that pass it NULL meaning the backend default, but we'd
perhaps use LC_C_GLOBAL for now and have to think about where we get
the database default locale_t in the future.

2. localeconv() is *doubly* non-thread-safe: it depends on the
current locale, and it also returns an object whose storage might be
clobbered by any other call to localeconv(), setlocale, or even,
according to POSIX, uselocale() (!!!). I think that effectively
closes off that escape hatch. On some OSes (macOS, BSDs) you find
localeconv_l() and then I think they give you a more workable
lifetime: as long as the locale_t lives, which makes perfect sense. I
am surprised that no one has invented localeconv_r() where you supply
the output storage, and you could wrap that in uselocale()
save/restore to deal with the other problem, or localeconv_r_l() or
something. I can't understand why this is so bad. The glibc
documentation calls it "a masterpiece of poor design". Ahh, so it
seems like we need to delete our use of localeconf() completely,
because we should be able to get all the information we need from
nl_langinfo_l() instead:

https://www.gnu.org/software/libc/manual/html_node/Locale-Information.html

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2024-08-12 03:32:32 RE: [BUG?] check_exclusion_or_unique_constraint false negative
Previous Message Peter Smith 2024-08-12 03:20:04 Re: Logical Replication of sequences