Re: [PATCH] Completed unaccent dictionary with many missing characters

From: Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: [PATCH] Completed unaccent dictionary with many missing characters
Date: 2022-05-05 19:44:15
Message-ID: ee5e0b6f-2a1c-a15c-041e-70208d4a0d86@sztoch.pl
Lists: pgsql-hackers

Tom Lane wrote on 5/4/2022 5:32 PM:
> Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
>> On 28.04.22 18:50, Przemysław Sztoch wrote:
>>> Current unnaccent dictionary does not include many popular numeric symbols,
>>> in example: "m²" -> "m2"
>> Seems reasonable.
> It kinda feels like this is outside the charter of an "unaccent"
> dictionary. I don't object to having these conversions available
> but it seems like it ought to be a separate feature.
>
> regards, tom lane
Tom, I disagree: the existing rules already perform many similar numeric
conversions, e.g. 1/2, 1/4...

Unicode is ubiquitous today, and far more unusual characters are in everyday use.
My patch simply adds the less common characters that were missing.

The characters missing from unaccent.rules therefore affect
the correct operation of the full-text search (FTS) mechanisms.
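
To illustrate the FTS point, here is a minimal sketch in Python. It is not
the PostgreSQL implementation: the rule table, `apply_rules` and `matches`
are hypothetical names made up for this example, and the matching is a crude
stand-in for a real text search. It only shows that a query typed as "m2"
finds a document containing "m²" when a rule for "²" exists, and misses it
when the rule is absent.

```python
# Hypothetical rule table in the spirit of unaccent.rules:
# source character -> replacement. Illustrative entries only,
# not copied from the PostgreSQL source tree.
RULES = {
    "é": "e",
    "²": "2",
    "³": "3",
}


def apply_rules(text, rules=RULES):
    """Replace every character that has a rule; leave the rest unchanged."""
    return "".join(rules.get(ch, ch) for ch in text)


def matches(query, document, rules=RULES):
    """Crude stand-in for a full-text match on whitespace-separated tokens."""
    doc_tokens = {apply_rules(tok, rules).lower() for tok in document.split()}
    return apply_rules(query, rules).lower() in doc_tokens


if __name__ == "__main__":
    doc = "area 50 m² total"
    # With a rule for the superscript two, the plain-ASCII query matches.
    print(matches("m2", doc))
    # Without that rule (assumed "incomplete" table), the same query misses.
    print(matches("m2", doc, {"é": "e"}))
```

The second call uses a deliberately incomplete table to mimic a rules file
that lacks the superscript digits.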
--
Przemysław Sztoch | Mobile +48 509 99 00 66
