From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Hugh Ranalli <hugh(at)whtc(dot)ca> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, thomas(dot)munro(at)enterprisedb(dot)com, Daniel Verite <daniel(at)manitou-mail(dot)org>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date: | 2019-01-09 03:52:53 |
Message-ID: | 20190109035253.GF21835@paquier.xyz |
Lists: | pgsql-bugs pgsql-hackers |
Hi Hugh,
On Fri, Jan 04, 2019 at 11:29:42AM -0500, Hugh Ranalli wrote:
> I think we're on the same page. I'll wait for you to finish your review and
> provide any further comments before I make any changes.
I have been doing a bit more than a review: I studied the old and new
formats, looked at how the XML parsing part could work, and hacked on
the code myself. On top of the incorrect URL for Latin-ASCII.xml, I
have also noticed that there should be only one
transforms/transform/tRule block in the source, so I think that we
should add an assertion on that as a sanity check. I have also changed
the code to use splitlines(), which is more portable across platforms,
and added an extra regression test for the new characters added to
unaccent.rules. This does not close this thread, but it lets us
support the new format. I have also documented how to browse the full
set of releases for Latin-ASCII.xml, and precisely which version has
been used for this patch.
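To illustrate the parsing approach described above, here is a minimal
sketch. It is not the code from the attached patch: SAMPLE_XML is a
hypothetical, heavily trimmed stand-in for Latin-ASCII.xml, and the
names parse_rules and RULE, as well as the simplified rule pattern, are
assumptions made for this example only.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Latin-ASCII.xml in the new format: all rules
# are plain text lines inside a single tRule element.
SAMPLE_XML = (
    "<supplementalData><transforms>"
    '<transform source="Latin" target="ASCII" direction="forward">'
    "<tRule>\n"
    "# a comment line\n"
    "\u00c6 \u2192 AE ;\n"
    "\u00df \u2192 ss ;\n"
    "</tRule></transform></transforms></supplementalData>"
)

# Simplified pattern (an assumption): "<source> → <target> ;"
RULE = re.compile(r"^(.+?) \u2192 (.+?) ;$")


def parse_rules(xml_text):
    """Return a dict mapping source characters to their ASCII replacement."""
    root = ET.fromstring(xml_text)
    blocks = root.findall("./transforms/transform/tRule")
    # Sanity check: the new format is expected to carry exactly one block.
    assert len(blocks) == 1, "expected exactly one tRule block"

    rules = {}
    # splitlines() handles \n, \r\n and \r alike, unlike split("\n").
    for line in blocks[0].text.splitlines():
        match = RULE.match(line.strip())
        if match:
            rules[match.group(1)] = match.group(2)
    return rules


if __name__ == "__main__":
    print(parse_rules(SAMPLE_XML))
```

The point about splitlines() can be seen on a string with Windows line
endings: "a\r\nb\n".split("\n") leaves a stray "\r" on the first item
and a trailing empty string, while "a\r\nb\n".splitlines() returns just
the two lines.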
This does not yet close the part about diacritical characters, but
supporting the new format is a step in this direction. What do
you think?
--
Michael
Attachment | Content-Type | Size |
---|---|---|
unaccent-format-update.patch | text/x-diff | 3.8 KB |