From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, hugh(at)whtc(dot)ca, daniel(at)manitou-mail(dot)org, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date: | 2018-12-18 06:07:35 |
Message-ID: | 20181218060735.GL1532@paquier.xyz |
Lists: | pgsql-bugs pgsql-hackers |
On Tue, Dec 18, 2018 at 12:36:02AM -0500, Tom Lane wrote:
> tl;dr: I think we should convert unaccent.sql and unaccent.out
> to UTF8 encoding. Then, adding more test cases for this patch
> will be easy.
Do you think that we could also remove the non-ASCII characters from the
tests? It would be easy enough to use E'\xNN' (UTF-8 hex) or similar in
the input, and to show the output as bytea. That's harder to read, but
the last time this was touched (5e8d670) we discussed not using UTF-8 in
the Python script, so that folks with simple terminals can still touch
the code. The characters used could be documented as comments in the
tests.
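
As a rough sketch of what that could look like, here is a small Python
helper that spells a test string as an ASCII-only E'...' literal and
gives the hex form that a bytea output would show. The function names are
made up for illustration and are not part of any existing script, and it
assumes a UTF8 database, where the \xNN bytes form valid characters.

```python
def to_pg_escape_literal(text: str) -> str:
    """Spell `text` as an ASCII-only PostgreSQL E'...' literal.

    Hypothetical helper: printable ASCII is kept as-is, every other byte
    of the UTF-8 encoding (plus quote and backslash) becomes \\xNN.
    """
    parts = []
    for byte in text.encode("utf-8"):
        if 0x20 <= byte < 0x7F and chr(byte) not in ("'", "\\"):
            parts.append(chr(byte))
        else:
            parts.append("\\x%02X" % byte)
    return "E'" + "".join(parts) + "'"


def to_bytea_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` in bytea hex output form (\\x...)."""
    return "\\x" + text.encode("utf-8").hex()


if __name__ == "__main__":
    # "e" followed by U+0301 COMBINING ACUTE ACCENT, written with an
    # escape so that this file itself stays ASCII-only.
    sample = "e\u0301"
    print(to_pg_escape_literal(sample))
    print(to_bytea_hex("e"))
```

The literal would go into unaccent.sql with a comment naming the
character, and the expected output would be compared in its hex form,
for example by wrapping the call in convert_to(..., 'UTF8').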
--
Michael