From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, iksss(dot)88(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #15805: Problem with lower function for greek sigma (Σ) letter |
Date: | 2019-05-15 13:44:20 |
Message-ID: | 31390.1557927860@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2019-May-15, Daniel Gustafsson wrote:
>> This is indeed a bug, and a rare occurrence since AFAICT from ISO 30112 and
>> googling there is only a single case of word-final lowercasing which is this
>> sigma. The attached patch takes a stab at fixing this.
> Ummm ... isn't this a counterexample?
> https://hebrew4christians.com/Grammar/Unit_One/Final_Forms/final_forms.html
I do not think the patch as given is acceptable in any case:
1. assumes without any evidence whatsoever that the system's wide-character
representation is Unicode code points;
2. assumes without checking that the locale is one that would allow this
conversion (counterexample: C locale);
3. unreasonable hard-coded assumption about what the "not a word character"
condition is.
It's possible that 1 and 2 could be finessed by checking both that the
original character is Σ and the new one is σ (in Unicode). We'd still
theoretically be taking a risk of the wrong substitution if the wchar
representation is not Unicode, but the odds seem fairly small. As for
point 3, why aren't you using iswalpha() on the next character?
regards, tom lane
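
The check suggested above (substitute only when the original character is Σ and the library's own lowercase result is σ, and use iswalpha() on the next character) might be sketched roughly as follows. This is a minimal illustration, not the proposed patch and not PostgreSQL code: the function name lower_with_final_sigma and the three macros are hypothetical, and the code point constants are only meaningful under the assumption that wchar_t holds Unicode code points, which is exactly the risk noted in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* ASSUMPTION: wchar_t values are Unicode code points. */
#define CAPITAL_SIGMA ((wchar_t) 0x03A3)	/* Σ */
#define SMALL_SIGMA   ((wchar_t) 0x03C3)	/* σ */
#define FINAL_SIGMA   ((wchar_t) 0x03C2)	/* ς */

/*
 * Hypothetical helper: lowercase ws[0..len-1] in place with towlower(),
 * using the word-final form of sigma when no letter follows.
 */
static void
lower_with_final_sigma(wchar_t *ws, size_t len)
{
	assert(ws != NULL);

	for (size_t i = 0; i < len; i++)
	{
		wchar_t		orig = ws[i];
		wchar_t		low = (wchar_t) towlower((wint_t) orig);

		/*
		 * Requiring both the input to be Σ and the locale's result to be σ
		 * means that a locale which does not perform this conversion (for
		 * example the C locale on many platforms) never triggers the
		 * substitution.
		 */
		if (orig == CAPITAL_SIGMA && low == SMALL_SIGMA)
		{
			/* Let the locale decide what counts as a word character. */
			int			next_is_alpha =
				(i + 1 < len) && iswalpha((wint_t) ws[i + 1]) != 0;

			if (!next_is_alpha)
				low = FINAL_SIGMA;
		}

		ws[i] = low;
	}
}
```

A caller would first select a locale with setlocale(LC_CTYPE, ...) and then pass a wide-character buffer and its length. The sketch looks only at the following character, as in the message; it does not check whether the sigma is preceded by a letter, so a Σ standing alone would also become ς.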