From: | Hannu Krosing <hannu(at)tm(dot)ee> |
---|---|
To: | Alexey Mahotkin <alexm(at)w-m(dot)ru> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: UPPER()/LOWER() and UTF-8 |
Date: | 2003-11-09 21:56:51 |
Message-ID: | 1068415011.4486.6.camel@fuji.krosing.net |
Lists: | pgsql-hackers |
Alexey Mahotkin wrote on Wed, 05.11.2003 at 17:11:
> Aha, that's in src/backend/utils/adt/formatting.c, right?
>
> Yes, I see, it goes byte by byte and uses toupper(). I believe we
> could look at the locale, and if it is UTF-8, then use (or copy)
> e.g. g_utf8_strup/strdown, right?
>
> http://developer.gnome.org/doc/API/2.0/glib/glib-Unicode-Manipulation.html#g-utf8-strup
>
> I believe that patch could be written in a matter of hours.
>
>
> TL> There has been some discussion of using <wctype.h> where
> TL> available, but this has a number of issues, notably figuring
> TL> out the correct mapping from the server string encoding (eg
> TL> UTF-8) to unpacked wide characters. At minimum we'd need to
> TL> know which charset the locale setting is expecting, and there
> TL> doesn't seem to be a portable way to find that out.
>
> TL> IIRC, Peter thinks we must abandon use of libc's locale
> TL> functionality altogether and write our own locale layer before
> TL> we can really have all the locale-specific functionality we
> TL> want.
>
> I believe that native Unicode strings (together with human language
> handling) should be introduced as an (almost) separate data type (which
> has nothing to do with locale), but maybe that's blue-sky.
They should have nothing to do with the _system_ locale, but you can do
neither UPPER()/LOWER() nor ORDER BY unless you know the locale. The point
is just that the locale should either be a property of the column or be
given in the SQL statement.
I guess one could write UCHAR, UVARCHAR, UTEXT types based on ICU.
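
To make the two points above concrete, here is a rough sketch in Python
(not PostgreSQL code). The names upper_bytewise and upper_with_locale, and
the explicit locale argument, are made up for illustration. The first
function only imitates what a byte-by-byte toupper() would do in the C
locale, and the second special-cases just the Turkish/Azerbaijani dotted i,
so treat it as an illustration of the idea rather than a real locale layer.

```python
def upper_bytewise(data: bytes) -> bytes:
    """Imitate a byte-by-byte toupper() in the C locale.

    Assumption: only ASCII a-z are mapped; every other byte, including
    the bytes of multibyte UTF-8 sequences, is passed through unchanged.
    """
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in data)


def upper_with_locale(text: str, locale: str = "") -> str:
    """Uppercase decoded text, with the locale given explicitly.

    `locale` is a hypothetical parameter (e.g. "tr_TR"), standing in for
    a per-column or per-statement locale. Only the Turkish/Azerbaijani
    rule "i -> İ" is handled here; everything else uses the default
    Unicode mapping.
    """
    language = locale.split("_")[0]
    if language in ("tr", "az"):
        # Dotted lowercase i uppercases to dotted capital İ (U+0130).
        text = text.replace("i", "\u0130")
    return text.upper()


if __name__ == "__main__":
    raw = "école".encode("utf-8")
    print(upper_bytewise(raw).decode("utf-8"))      # éCOLE
    print(upper_with_locale("école"))               # ÉCOLE
    print(upper_with_locale("istanbul", "en_US"))   # ISTANBUL
    print(upper_with_locale("istanbul", "tr_TR"))   # İSTANBUL
```

The byte-wise version leaves "é" alone because its two UTF-8 bytes are not
ASCII letters, and the same input string uppercases differently once the
locale is known, which is why the locale has to come from somewhere.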
-------------
Hannu