From: | Francisco Olarte <folarte(at)peoplecall(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, shailesh(dot)totale(at)sailpoint(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18216: Unaccent function is unable to remove accents (diacritic signs) from Japanese character 'ド' |
Date: | 2023-11-29 08:12:45 |
Message-ID: | CA+bJJbw6n7Zx2XdmFEGv6dmXCFu6VpVbsfU7whsqkhwk7XCerw@mail.gmail.com |
Lists: | pgsql-bugs |
Hi Jeff:
On Wed, 29 Nov 2023 at 03:40, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
I am not going to generally discuss this:
> But isn't it generally the case that removing accents might make you land on a different word with a different meaning?
But this one is a bad example,
> 'ano' and 'año' for example mean different things in Spanish (but unaccent removes it anyway, at least in one out of four attempts to get the non-7-bit-ASCII wedged through my terminal and into the function).
N and Ñ are different letters in Spanish. The tilde looks like an accent,
can be typed as such, and some unaccent rules in some programs may make
them equal, but Ñ is as different from N as it is from Z ( I am Spanish,
and in case you want some authority link see
https://www.rae.es/dpd/%C3%B1 ). It has its own pages in the dictionary
( even on paper, I just checked in case my memory fails ).
We also used to have CH and LL as letters, but they were dropped
"recently" ( meaning this century, I'm getting old ).
As for the other "accents", à, è, ì, ò, ù can generally be unaccented w/o
problem. They may change meaning in some corner cases, but I do not
remember seeing them do that since the special examples in school.
Another thing is ü, which is used in our "special" handling of hard/soft
vowels after g, i.e., you do not pronounce the u in "reguero" ( but it
modifies how you pronounce the g, differently from "agente" ), but in
"agüero" you do pronounce it.
But Ñ is a proper letter, you cannot break it. Our alphabet goes m-n-ñ-o-p-q.
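To illustrate the point, here is a minimal Python sketch of accent stripping via Unicode decomposition, one common approach in "some programs", next to a variant that leaves Ñ alone. This is not how PostgreSQL's unaccent works (that one is driven by a rules file); the function names and the KEEP set are hypothetical, picked only for this example.

```python
import unicodedata

# Assumption for this sketch: the letters we refuse to break apart.
KEEP = {"ñ", "Ñ"}


def naive_strip(text: str) -> str:
    """Decompose (NFD) and drop every combining mark, Ñ included."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def strip_keep_enye(text: str) -> str:
    """Same idea, but characters listed in KEEP pass through untouched."""
    out = []
    # Compose first (NFC) so that "n" + combining tilde is seen as one "ñ".
    for ch in unicodedata.normalize("NFC", text):
        out.append(ch if ch in KEEP else naive_strip(ch))
    return "".join(out)


if __name__ == "__main__":
    sample = "año agüero camión"
    print(naive_strip(sample))
    print(strip_keep_enye(sample))
```

The naive version turns "año" into "ano", which is exactly the change of word discussed above, while the second version only touches the real accents and the ü.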
Francisco Olarte.
P.S. to really sound Spanish, we would have picked "cono" for the
examples :-p
FO