From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | john(at)geeknet(dot)com(dot)au |
Cc: | alvherre(at)dcc(dot)uchile(dot)cl, pgman(at)candle(dot)pha(dot)pa(dot)us, girgen(at)pingpong(dot)net, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Patch for collation using ICU |
Date: | 2005-05-08 05:41:17 |
Message-ID: | 20050508.144117.108759134.t-ishii@sra.co.jp |
Lists: | pgsql-hackers |
> Alvaro Herrera wrote:
> > Sent: Sunday, May 08, 2005 2:49 PM
> > To: John Hansen
> > Cc: Tatsuo Ishii; pgman(at)candle(dot)pha(dot)pa(dot)us;
> > girgen(at)pingpong(dot)net; pgsql-hackers(at)postgresql(dot)org
> > Subject: Re: [HACKERS] Patch for collation using ICU
> >
> > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
> > > Tatsuo Ishii wrote:
> >
> > > > So Japanese(including ASCII)/UNICODE behavior is perfectly correct
> > > > at this moment.
> > >
> > > Right, so you _never_ use accented ascii characters in Japanese?
> > > (like è for example, whose uppercase is È)
> >
> > That isn't ASCII. It's latin1 or some other ASCII extension.
>
> Point taken...
> But...
>
> If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding, not UTF-8 (unicode).
> UTF-8 encoded databases are very useful for representing multiple languages in the same database,
> but this usefulness vanishes if functions like upper/lower don't work correctly.
I'm just curious whether mixed German/French/Spanish text can be sorted
correctly. I think these languages need their own locales even with
UNICODE/ICU.
> So optimizing for 3 languages breaks more than a hundred, that doesn't seem fair!
Why don't you add a GUC variable or some such to control the
upper/lower behavior?
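
To illustrate the idea only, here is a minimal sketch in Python rather than backend C. The switch name `unicode_case_mapping` and the helper functions are hypothetical; they are not part of the patch or of PostgreSQL. The sketch shows an ASCII-only mapping next to a Unicode-aware one, selected by a single setting.

```python
def upper_ascii_only(text: str) -> str:
    """Uppercase only the ASCII letters a-z; leave every other character as-is."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in text
    )


def upper_unicode(text: str) -> str:
    """Uppercase using Python's built-in Unicode case mapping."""
    return text.upper()


def upper_with_setting(text: str, unicode_case_mapping: bool) -> str:
    # unicode_case_mapping is a hypothetical on/off setting standing in
    # for the kind of GUC variable suggested above.
    if unicode_case_mapping:
        return upper_unicode(text)
    return upper_ascii_only(text)


if __name__ == "__main__":
    sample = "è abc 日本語"
    print(upper_with_setting(sample, unicode_case_mapping=False))
    print(upper_with_setting(sample, unicode_case_mapping=True))
```

With the switch off, only a-z change and è is left alone; with it on, è becomes È. The Japanese characters are unaffected either way, since they have no case.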
--
Tatsuo Ishii