From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: fixing tsearch locale support |
Date: | 2024-12-13 06:16:22 |
Message-ID: | 563dac99-5848-4126-ba7c-752820369da8@eisentraut.org |
Lists: | pgsql-hackers |
On 12.12.24 19:14, Jeff Davis wrote:
> On Mon, 2024-12-02 at 11:57 +0100, Peter Eisentraut wrote:
>> t_isdigit() and t_isspace() are just used to parse various configuration
>> and data files, and surely we don't need support for encoding-dependent
>> multibyte support for parsing ASCII digits and ASCII spaces.
>> ... So these can be replaced by the normal isdigit() and isspace().
>
> That would still call libc, and still depend on LC_CTYPE. Should we use
> pure ASCII variants?
isdigit() and isspace() in particular are widely used throughout the
backend code without such concerns. I think the assumption is that this
is not a problem in practice: For multibyte encodings, these functions
would only be able to process the ASCII subset, and the character
classification of that should be consistent across all locales. For
single-byte encodings, among the encodings that PostgreSQL supports, I
don't think any of them actually provide non-ASCII digits or space
characters.
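
For illustration only, the "pure ASCII variants" raised in the quoted question could look roughly like the sketch below. The names ascii_isdigit() and ascii_isspace() are hypothetical, not existing PostgreSQL functions, and the sketch assumes an ASCII-compatible execution character set. The helpers never consult libc or LC_CTYPE, so any byte with the high bit set is classified as neither a digit nor a space.

```c
#include <stdbool.h>

/*
 * Hypothetical locale-independent helpers (assumed names, not PostgreSQL
 * API).  They classify only 7-bit ASCII and ignore LC_CTYPE entirely.
 */
static inline bool
ascii_isdigit(unsigned char c)
{
	return c >= '0' && c <= '9';
}

static inline bool
ascii_isspace(unsigned char c)
{
	/*
	 * Space plus the contiguous ASCII range \t, \n, \v, \f, \r (9..13):
	 * the same six characters isspace() accepts in the "C" locale.
	 */
	return c == ' ' || (c >= '\t' && c <= '\r');
}
```

Taking unsigned char as the argument type sidesteps the usual pitfall of passing a negative char value to the <ctype.h> functions.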