| From: | Andreas Karlsson <andreas(at)proxel(dot)se> |
|---|---|
| To: | Jeff Davis <pgsql(at)j-davis(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Remaining dependency on setlocale() |
| Date: | 2024-08-28 16:26:04 |
| Message-ID: | 5e005b43-4150-4f55-b6f8-c6951ccf979f@proxel.se |
| Lists: | pgsql-hackers |
On 8/9/24 8:24 PM, Jeff Davis wrote:
> On Fri, 2024-08-09 at 13:41 +0200, Andreas Karlsson wrote:
>> I am leaning towards that we should write our own pure ascii functions
>> for this.
>
> That makes sense for a lot of call sites, but it could cause breakage
> if we aren't careful.
>
>> Since we do not support any non-ascii compatible encodings anyway I do
>> not see the point in having locale support in most of these call-sites.
>
> An ascii-compatible encoding just means that the code points in the
> ascii range are represented as ascii. I'm not clear on whether code
> points in the ascii range can return different results for things like
> isspace(), but it sounds plausible -- toupper() can return different
> results for 'i' in tr_TR.
>
> Also, what about the values outside 128-255, which are still valid
> input to isspace()?

My idea was that in a lot of those cases we only try to parse e.g. 0-9
as digits and always only . as the decimal separator, so we should just
make that obvious by either using locale C or writing our own ASCII-only
functions. These strings are primarily meant to be read by machines, not
humans.
Andreas