From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Tasos Maschalidis <TaS(dot)O(dot)S(at)hotmail(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #15347: Unaccent for greek characters does not work |
Date: | 2018-08-23 22:16:14 |
Message-ID: | CAEepm=0RUhOuvQs2LQnFYzR4GWHtn6wUT9UaKi+vC0erKW4=dw@mail.gmail.com |
Lists: | pgsql-bugs |
On Fri, Aug 24, 2018 at 12:22 AM, Tasos Maschalidis <TaS(dot)O(dot)S(at)hotmail(dot)com> wrote:
>     return (codepoint.id >= ord('a') and codepoint.id <= ord('z')) or \
>            (codepoint.id >= ord('A') and codepoint.id <= ord('Z')) or \
>            (codepoint.id >= ord('α') and codepoint.id <= ord('ω')) or \
>            (codepoint.id >= ord('Α') and codepoint.id <= ord('Ω'))
Thank you. Here it is in the form of a patch that I propose to commit
to PostgreSQL 12. It adds 221 lines to unaccent.rules. They look
sane to my untrained eye. Do you agree?
Example of use:
postgres=# select unaccent('Θέμα: Re: BUG #15347: Unaccent for greek ...');
                   unaccent
----------------------------------------------
 Θεμα: Re: BUG #15347: Unaccent for greek ...
(1 row)
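For readers who want to see how the range check quoted above relates to this result, here is a rough Python sketch. It is not the code from the patch and not how the unaccent dictionary works internally (unaccent looks characters up in its rules file). It assumes that Python's unicodedata canonical decomposition (NFD) is a fair stand-in for the decomposition data the rules are derived from, and the helper names base_letter and unaccent_sketch are made up for illustration.

```python
import unicodedata


def is_plain_letter(ch):
    """Range check from the quoted snippet, applied to one character."""
    cp = ord(ch)
    return (ord('a') <= cp <= ord('z')) or \
           (ord('A') <= cp <= ord('Z')) or \
           (ord('α') <= cp <= ord('ω')) or \
           (ord('Α') <= cp <= ord('Ω'))


def base_letter(ch):
    """Hypothetical helper: return the plain base letter of an accented
    character, or None if the character is not 'plain letter + marks'."""
    decomposed = unicodedata.normalize('NFD', ch)
    if (len(decomposed) > 1
            and is_plain_letter(decomposed[0])
            and all(unicodedata.combining(c) for c in decomposed[1:])):
        return decomposed[0]
    return None


def unaccent_sketch(text):
    """Hypothetical helper: replace each accented letter by its base letter,
    leaving every other character untouched."""
    return ''.join(base_letter(ch) or ch for ch in text)


print(unaccent_sketch('Θέμα: Re: BUG #15347'))  # Θεμα: Re: BUG #15347
```

Without the two Greek ranges, is_plain_letter rejects 'ε' as a base letter, so 'έ' would pass through unchanged, which matches the behaviour reported in the bug.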
I wondered if the documentation might need a change, but it already
says something broad enough: "A more complete example, which is
directly useful for most European languages, can be found in
unaccent.rules, ...".
--
Thomas Munro
http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
0001-Add-Greek-characters-to-unaccent.rules.patch | application/octet-stream | 4.1 KB |