Re: Patch for collation using ICU

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: John Hansen <john(at)geeknet(dot)com(dot)au>
Cc: Palle Girgensohn <girgen(at)pingpong(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch for collation using ICU
Date: 2005-05-07 13:46:39
Message-ID: 200505071346.j47Dkd227665@candle.pha.pa.us
Lists: pgsql-hackers

John Hansen wrote:
> > --On Saturday, May 07, 2005 22.53.46 +1000 John Hansen
> > <john(at)geeknet(dot)com(dot)au>
> > wrote:
> >
> > > Errm,... initdb --encoding UNICODE --locale C
> >
> > You mean that ICU *shall* be used even for the C locale, and
> > not as Bruce suggested here:
>
> Yes, that's exactly what I mean.

There are two reasons for that optimization. First, some locale
support is broken and Unicode encoding with a C locale crashes (not an
issue for ICU). Second, it is an optimization for languages like
Japanese that want to use Unicode but don't need a locale, because
upper/lower means nothing in those character sets.

So, the first issue doesn't apply to ICU, and the second might not
either, depending on which characters of the Unicode character set you
are using.
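
To make the "depending on what characters" point concrete, here is a
minimal sketch in Python (not part of the patch; the function name
needs_case_mapping is hypothetical). It relies on Python's built-in
Unicode tables, not on ICU, and simply checks whether any character in
a string changes under upper() or lower():

```python
def needs_case_mapping(text: str) -> bool:
    """Return True if any character in text changes under upper() or lower().

    Assumption: Python's built-in Unicode case tables stand in for
    whatever tables the database would consult.
    """
    return any(ch.upper() != ch or ch.lower() != ch for ch in text)


if __name__ == "__main__":
    samples = [
        "日本語のテキスト",    # kanji, hiragana and katakana only
        "ｐｏｓｔｇｒｅｓ",    # fullwidth Latin letters, common in Japanese text
        "données",            # accented Latin letters
    ]
    for s in samples:
        print(s, needs_case_mapping(s))
```

Kanji and kana have no case, so the shortcut is harmless for them, but
fullwidth Latin letters do have upper and lower forms, so even
Japanese text is not always case-free.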

I guess I am a little confused about how ICU can do upper() when the
locale is C. What is it using to determine that A is the uppercase of a?
Am I confused?
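
One possible answer, offered as an assumption and not as a statement
about what the patch does: the Unicode Character Database defines
default case mappings per code point, and those do not depend on any
locale setting. The sketch below uses Python's str.upper(), which
follows those default mappings, as a stand-in; the helper name
default_upper is hypothetical and no ICU call is involved:

```python
import unicodedata


def default_upper(ch: str) -> str:
    """Uppercase via Unicode default case mapping; no locale is consulted."""
    return ch.upper()


if __name__ == "__main__":
    # Latin, accented Latin, Cyrillic, German sharp s, hiragana
    for ch in ("a", "é", "я", "ß", "あ"):
        name = unicodedata.name(ch)
        print(f"U+{ord(ch):04X} {name}: {default_upper(ch)!r}")
```

Under this assumption a locale would only be needed for tailorings,
such as Turkish, where the uppercase of "i" is the dotted capital "İ"
and not the default "I".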

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
