From: | Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Mohammad Alhashash <alhashash(at)alhashash(dot)net> |
Subject: | Re: PATCH: Allow empty targets in unaccent dictionary |
Date: | 2014-06-30 20:10:39 |
Message-ID: | 20140630201039.GA11973@toroid.org |
Lists: | pgsql-hackers |
At 2014-06-30 15:19:17 -0400, tgl(at)sss(dot)pgh(dot)pa(dot)us wrote:
>
> Anyway, this raises the question of whether the current patch is
> actually a desirable way to do things, or whether it would be better
> if the unaccenting rules were like "base-char accent-char" ->
> "base-char".
It might be useful to be able to write such rules, but it would be
highly impractical to do so instead of being able to single out
accent-chars for removal.
In all the languages I'm familiar with that use such accent-chars, any
accent-char would form a valid combination with nearly every base-char,
unlike European languages where you don't have to worry about k-umlaut,
say. Also, a standalone accent-char would always be meaningless.
(These accent-chars don't actually exist independently in the syllabary
that a Hindi speaker might learn in school: they're combining forms of
vowels and are treated differently from characters in practice.)
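As a rough illustration of the trade-off described above, here is a minimal Python sketch. It is not the unaccent module's code or rule-file format: the rule tables, the function names, and the small sample of Devanagari consonants and vowel signs are all assumptions chosen for the example. It compares rules that map an accent-char to an empty target with rules of the form "base-char accent-char" -> "base-char".

```python
# Hypothetical rule tables for illustration only; this is not how the
# unaccent dictionary stores or applies its rules.
VOWEL_SIGNS = ["\u093e", "\u093f", "\u0940"]  # Devanagari vowel signs aa, i, ii
CONSONANTS = ["\u0915", "\u0916", "\u0917"]   # Devanagari ka, kha, ga

# Style 1: one rule per accent-char, each with an empty target.
empty_target_rules = {sign: "" for sign in VOWEL_SIGNS}

# Style 2: one rule per (base-char, accent-char) pair.
pair_rules = {base + sign: base for base in CONSONANTS for sign in VOWEL_SIGNS}


def apply_single_char_rules(text, rules):
    """Replace each character that has a rule; an empty target drops it."""
    return "".join(rules.get(ch, ch) for ch in text)


def apply_pair_rules(text, rules):
    """Scan left to right, replacing any two-character source string."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in rules:
            out.append(rules[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


word = "\u0915\u0940\u0917\u093e"   # ka + ii, ga + aa
unlisted = "\u0918\u0940"           # gha + ii; gha is not in CONSONANTS

stripped_single = apply_single_char_rules(word, empty_target_rules)
stripped_pair = apply_pair_rules(word, pair_rules)
unlisted_single = apply_single_char_rules(unlisted, empty_target_rules)
unlisted_pair = apply_pair_rules(unlisted, pair_rules)
```

In this sketch the pair table grows as the product of the two character sets (nine rules for three consonants and three vowel signs, against three empty-target rules), and it leaves the vowel sign in place on any base-char that was not listed.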
> Also, if there are any contexts where the right translation of an
> accent-char depends on the base-char, you couldn't do it with the
> patch as it stands.
I can't think of a satisfactory example at the moment, but that sounds
entirely plausible.
> It's not unlikely that we want this patch *and* an improvement that
> allows multi-character src strings
I think it's enough to apply just this patch, but I wouldn't object to
doing both if it were easy. It's not clear to me if that's true after a
quick glance at the code, but I'll look again when I'm properly awake.
> Lastly, I didn't especially like the coding details of either proposed
> patch, and rewrote it as attached.
:-)
-- Abhijit