From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Hugh Ranalli <hugh(at)whtc(dot)ca> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, thomas(dot)munro(at)enterprisedb(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date: | 2018-12-15 21:20:11 |
Message-ID: | 29419.1544908811@sss.pgh.pa.us |
Lists: | pgsql-bugs pgsql-hackers |

Hugh Ranalli <hugh(at)whtc(dot)ca> writes:
> The problem is that I downloaded the latest version of the Latin-ASCII
> transliteration file (r34 rather than the r28 specified in the URL). Over 3
> years ago (in r29, of course) they changed the file format (
> https://unicode.org/cldr/trac/ticket/5873) so that
> parse_cldr_latin_ascii_transliterator loads an empty rules set.

Ah-hah.

> I'd be
> happy to either a) support both formats, or b), support just the newest and
> update the URL. Option b) is cleaner, and I can't imagine why anyone would
> want to use an older rule set (then again, struggling with Unicode always
> makes my head hurt; I am not an expert on it). Thoughts?

(b) seems sufficient to me, but perhaps someone else has a different
opinion.

Whichever we do, I think it should be a separate patch from the feature
addition for combining diacriticals, just to keep the commit history
clear.

regards, tom lane