| From: | Hannu Krosing <hannu(at)tm(dot)ee> |
|---|---|
| To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
| Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Alexey Mahotkin <alexm(at)hsys(dot)msk(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Proper Unicode support |
| Date: | 2003-08-12 22:18:00 |
| Message-ID: | 1060726680.2318.40.camel@fuji.krosing.net |
| Lists: | pgsql-hackers |
Oleg Bartunov wrote on Mon, 11.08.2003 at 11:52:
> On Mon, 11 Aug 2003, Peter Eisentraut wrote:
>
> > Alexey Mahotkin writes:
> >
> > > AFAIK, currently the codepoints are sorted in their numerical order. I've
> > > searched the source code and could not find the actual place where this is
> > > done. I've seen executor/nodeSort.c and utils/tuplesort.c. AFAIU, they
> > > are generic sorting routines.
> >
> > PostgreSQL uses the operating system's locale routines for this. So the
> > sort order depends on choosing a locale that can deal with Unicode.
> >
>
> sort order works, but upper/lower are broken.
I think that the original MB/Unicode support was made for the Japanese
language and its characters, and AFAIK Japanese doesn't even have the
concept (or problem) of upper/lower case.
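
To make the symptom concrete, here is a minimal sketch in Python (chosen only for illustration; it is not PostgreSQL code). It assumes the breakage comes from a case mapping that works one byte at a time on UTF-8 data, which this thread does not confirm. The sample string and variable names are hypothetical.

```python
# Illustration only: not PostgreSQL code. It assumes the upper/lower
# breakage is caused by a byte-wise case mapping applied to UTF-8 text.

text = "привет, Hannu"          # hypothetical sample with Cyrillic and ASCII
utf8 = text.encode("utf-8")     # each Cyrillic letter becomes two bytes >= 0x80

# Byte-wise mapping: bytes.upper() only converts ASCII a-z, so the
# multibyte Cyrillic letters pass through unchanged.
bytewise = utf8.upper().decode("utf-8")

# Character-aware mapping: str.upper() works on whole code points.
charwise = text.upper()

print(bytewise)
print(charwise)
```

Only the ASCII part changes case in the byte-wise result, while the character-aware call also upper-cases the Cyrillic letters. Byte order comparisons can still look plausible on such data, which fits the report that sorting works while upper/lower do not.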
A question to the core team: are there any plans to rectify this for less
fortunate languages/charsets?
Will the ASCII-speaking core tolerate the potential loss of performance
from locale-aware upper/lower?
Will this be considered a feature or a bugfix (i.e. should we attempt to
fix it for 7.4)?
---------------
Hannu