From: | Benedikt Eric Heinen <beh(at)icemark(dot)ch> |
---|---|
To: | Patrice Hédé <patrice(at)idf(dot)net> |
Cc: | pgsql-sql(at)postgreSQL(dot)org |
Subject: | Re: [SQL] Internationalisation: SELECT str (ignoring Umlauts/Accents) |
Date: | 1998-06-17 19:55:31 |
Message-ID: | Pine.LNX.3.96.980617210549.30824C-100000@fenun.icemark.ch |
Lists: | pgsql-sql |
> Do you mean you have a field with German *and* French *and* Italian *and*
> English words in it, and you want people, be they german-, french-,
> italian-, english-speaking, to be able to access this field, without
> putting accents and all ?
Right - basically, I am building a web database with addresses of a group
of people all over Switzerland, who are members of the same club. The
problem is just that for a Mr. "á Porta", I (who can't speak French or
Italian) don't know what the right spelling with accents is. In much the
same way, a native French speaker from the western part of Switzerland
possibly doesn't know whether, or which, Umlaut has to be used in a
German name...
> As I said earlier, you may have problems, since `ae' doesn't mean `ä' for
> most of these people (except the german-speaking ones), and they may put
> `a' instead. As the rules are different among the languages, it's
> difficult to have a single solution. However, you *need* a solution.
> Maybe I, or others ;) , may help though. Some questions : what is your
> interface language (if it's perl, it can be much easier :) ) ? Can it be a
> client-side solution, or do you absolutely need a server-side one (which
> would then have to be a C function, I think) ?
The program is a server-side C++ CGI (I can't program in Perl).
I just thought - I am certainly not the first to have had this kind of
problem...
> And then, what kind of conversions do you need ? For example, for French,
> I decided that all a, e, i, o, u, y to be equal, which meant :
>
> any of a,A,à,À,æ,Æ,å,Å,â,Â,á,Á,ä,Ä => a,A,à,À,æ,Æ,å,Å,â,Â,á,Á,ä,Ä
> etc.
Let's say that only the search string should ever be modified, so an "ä"
in the search string should never match "ae" in a string in the database.
The modifications should be:
part of search string    can match in database-side string
a                        a, a umlaut, a with acute/grave/circumflex accent
ae                       ae, a umlaut
c                        c, c cedilla
e                        e, e with acute/grave/circumflex accent
i                        i, i with acute/grave/circumflex accent
o                        o, o umlaut, o with acute/grave/circumflex accent
oe                       oe, o umlaut
u                        u, u umlaut, u with acute/grave/circumflex accent
ue                       ue, u umlaut
[all searches will be case insensitive]
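One way the rules above could be handled on the client side is to expand the search string into a regular expression before it goes into the query. The following is only a rough sketch under several assumptions: the function names `letterClass` and `expandSearchPattern` are made up for illustration, the source file and the database use the same encoding (UTF-8 here), and the resulting pattern is matched with PostgreSQL's case-insensitive regex operator `~*`. Whether `~*` folds case for the accented letters themselves depends on the server's locale setup. The sketch also assumes that a digraph such as "ae" should still match accented variants of its two letters (for example "áe") in addition to the umlaut.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: bracket expression listing the database-side letters
// that a plain search letter may match. Returns "" for letters with no rule.
static std::string letterClass(char c) {
    switch (c) {
        case 'a': return "[aäáàâ]";
        case 'c': return "[cç]";
        case 'e': return "[eéèê]";
        case 'i': return "[iíìî]";
        case 'o': return "[oöóòô]";
        case 'u': return "[uüúùû]";
        default:  return std::string();
    }
}

// Lowercase ASCII bytes only; bytes >= 0x80 (accented letters typed by the
// user) are passed through untouched, so "ä" never turns into "ae".
static char lowerAscii(char c) {
    unsigned char raw = static_cast<unsigned char>(c);
    return raw < 0x80 ? static_cast<char>(std::tolower(raw)) : c;
}

// Hypothetical function: expand a user search string into a regex pattern
// following the table above. Only the search string is modified.
std::string expandSearchPattern(const std::string& search) {
    static const std::string meta = "\\.^$|()[]{}*+?";
    std::string out;
    for (std::size_t i = 0; i < search.size(); ++i) {
        char c = lowerAscii(search[i]);
        char next = (i + 1 < search.size()) ? lowerAscii(search[i + 1]) : '\0';

        if ((c == 'a' || c == 'o' || c == 'u') && next == 'e') {
            // Digraph: either the two letters (with their own variants)
            // or the single umlaut character.
            const char* umlaut = (c == 'a') ? "ä" : (c == 'o') ? "ö" : "ü";
            out += "(" + letterClass(c) + letterClass('e') + "|" + umlaut + ")";
            ++i;  // the 'e' has been consumed as part of the digraph
            continue;
        }

        std::string cls = letterClass(c);
        if (!cls.empty()) {
            out += cls;
        } else {
            // Escape regex metacharacters so user input is matched literally.
            if (meta.find(c) != std::string::npos) out += '\\';
            out += c;
        }
    }
    return out;
}
```

The CGI would then run something like `SELECT ... WHERE name ~* $1` with the expanded pattern passed as a bound parameter (the column name `name` is a placeholder). Because the expansion is applied to the search string alone, a search for "Mueller" finds "Müller", while a search for "Müller" does not find "Mueller".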
> Obviously, in your case, it will be more complex, since `ae' *may* have a
> special meaning... (that's where it's getting difficult :( )...
I hope the above description is somewhat useful to you (unfortunately I am
lacking the matching characters on my US keyboard - so I described which
ones should be matched).
I guess the ideal way would be to build a general pluggable module for
PostgreSQL, so that it can handle this somewhat transparently.
Benedikt
Windows 95: n.
32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit
operating system originally coded for a 4-bit microprocessor, written
by a 2-bit company that can't stand for 1 bit of competition.