From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | hugh(at)whtc(dot)ca, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, daniel(at)manitou-mail(dot)org, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date: | 2018-12-18 04:57:08 |
Message-ID: | 20181218045708.GI1532@paquier.xyz |
Lists: | pgsql-bugs pgsql-hackers |
On Tue, Dec 18, 2018 at 03:05:00PM +1100, Thomas Munro wrote:
> I don't think this is quite right. Those don't seem to be the
> combining codepoints[1], and in any case they are being replaced with
> ASCII characters, whereas I thought we wanted to replace them with
> nothing at all. Here is my attempt to come up with a test case using
> combining characters:
>
> select unaccent('un café crème s''il vous plaît');
>
> It's not stripping the accents. I've attached that in a file for
> reference so you can run it with psql -f x.sql, and you can see that
> it's using combining code points (code points 0301, 0300, 0302 which
> come out as cc81, cc80, cc82 in UTF-8) like so:
Could you also add some tests to contrib/unaccent/sql/unaccent.sql at
the same time? That would make it easy to check the extent of the
patches proposed on this thread.
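
For reference, here is a minimal Python sketch of what such a test would
need to exercise. It is not part of any patch on this thread, and it does
not reflect how unaccent works internally. It only checks the byte
sequences quoted above (code points 0301, 0300, 0302 encoding as cc81,
cc80, cc82 in UTF-8) and shows the result one would expect once combining
marks are stripped. The helper names utf8_hex and strip_combining are
hypothetical, and the use of NFD normalization to build the input is an
assumption about how the test string was produced.

```python
import unicodedata

# Combining acute, grave and circumflex accents named in the quoted message.
COMBINING = ["\u0301", "\u0300", "\u0302"]


def utf8_hex(s: str) -> str:
    """Return the UTF-8 encoding of s as a lowercase hex string."""
    return s.encode("utf-8").hex()


def strip_combining(s: str) -> str:
    """Drop every character with a non-zero canonical combining class."""
    return "".join(ch for ch in s if not unicodedata.combining(ch))


# Build the decomposed form: each accented letter becomes a base letter
# followed by a combining mark (assumption: this matches the test input).
decomposed = unicodedata.normalize("NFD", "un caf\u00e9 cr\u00e8me s'il vous pla\u00eet")

print([utf8_hex(c) for c in COMBINING])  # ['cc81', 'cc80', 'cc82']
print(strip_combining(decomposed))       # un cafe creme s'il vous plait
```

A regression test in unaccent.sql would presumably feed the same
decomposed string to unaccent() and expect the plain ASCII result.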
--
Michael