Re: BUG #13440: unaccent does not remove all diacritics

From:	Léonard Benedetti <benedetti(at)mlpo(dot)fr>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject:	Re: BUG #13440: unaccent does not remove all diacritics
Date:	2016-03-10 14:35:00
Message-ID:	56E18614.90003@mlpo.fr
Lists:	pgsql-bugs

On 10/03/2016 14:46, Teodor Sigaev wrote:
>>
>> On the other hand, UTF-8 encoding for source code is *a feature of
>> Python 3* (to quote the documentation: “The default encoding for Python
>> source code is UTF-8”) so there is no possible ambiguity, and it will
>> not be a problem. That said, some non-ASCII characters could be removed
>> from the script's source code without harm (I am thinking in
>> particular of "“" and "”"). Nevertheless, for some comments, it would be
>> unfortunate (e.g. “# RegEx to parse rules (e.g. “Đ → D ; […]”)” or “# ℃
>> °C”).
> Ok, I didn't know that.
>
>
>> Thus, I propose adapting the code to Python 3 (the encoding of the
>> script does not seem to be a problem for the above reasons). I will try
>> to do it shortly.
> We are waiting...
>
Sorry for the delay. Adapting the script to Python 3 was very easy (the
code is almost identical).
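
To illustrate the encoding point, here is a minimal Python 3 sketch, not the
attached script itself. It shows that non-ASCII characters can appear in
comments and string literals without any encoding declaration, and how a rule
written like “Đ → D ;” could be parsed. The names RULE_RE, parse_rule and
strip_combining_marks, as well as the exact rule syntax, are assumptions made
for this example only.

```python
import re
import unicodedata

# Hypothetical pattern for a rule such as "Đ → D ;" (source, arrow, target).
# Python 3 reads this file as UTF-8 by default, so "→" needs no escaping.
RULE_RE = re.compile(r"^(?P<src>\S+)\s*→\s*(?P<dst>[^;]*?)\s*;")


def parse_rule(line):
    """Return (source, target) for a line like 'Đ → D ;', or None."""
    match = RULE_RE.match(line)
    if match is None:
        return None
    return match.group("src"), match.group("dst")


def strip_combining_marks(text):
    """Decompose text (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


if __name__ == "__main__":
    print(parse_rule("Đ → D ; […]"))
    print(strip_combining_marks("é"))
    # "Đ" has no canonical decomposition, so it passes through unchanged;
    # characters like this one need an explicit rule.
    print(strip_combining_marks("Đ"))
```

Stripping combining marks after decomposition handles letters such as “é”,
while characters such as “Đ” are left untouched, which is why an explicit
rule list is still useful.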

As usual, you will find attached the new version of the script and the
generated output for convenience.

Léonard Benedetti

Attachment	Content-Type	Size
contrib_unaccent_generate_unaccent_rules.py	text/x-python	8.9 KB
unaccent.rules	text/plain	6.2 KB

In response to

Re: BUG #13440: unaccent does not remove all diacritics at 2016-03-10 13:46:21 from Teodor Sigaev

Responses

Re: BUG #13440: unaccent does not remove all diacritics at 2016-03-10 14:44:17 from Léonard Benedetti
