Re: UTF-8

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc:	Tomi NA <hefest(at)gmail(dot)com>, Martins Mihailovs <martins(dot)mihailovs(at)europrojects(dot)org>, pgsql-general(at)postgresql(dot)org
Subject:	Re: UTF-8
Date:	2006-10-13 17:22:22
Message-ID:	3001.1160760142@sss.pgh.pa.us
Lists:	pgsql-general

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> Characters haven't fitted in an unsigned char in a very long time. It's
> obviously bogus for any multibyte encoding (the code even says so). For
> such encodings you could use the system's towupper() (ANSI C/Unix98)
> which will work on any Unicode char.

http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/oracle_compat.c?rev=1.67

* If the system provides the needed functions for wide-character manipulation
* (which are all standardized by C99), then we implement upper/lower/initcap
* using wide-character functions. Otherwise we use the traditional <ctype.h>
* functions, which of course will not work as desired in multibyte character
* sets. Note that in either case we are effectively assuming that the
* database character encoding matches the encoding implied by LC_CTYPE.
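
For illustration only, the wide-character approach that comment describes can be sketched roughly as follows. This is not the code in oracle_compat.c: the function name upper_wide is hypothetical, and the sketch assumes a POSIX-style libc in which mbstowcs() and wcstombs() accept a NULL destination to report the required length. It also assumes the caller has already set LC_CTYPE to a locale whose encoding matches the input bytes, which is the same assumption the comment points out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * upper_wide (hypothetical helper, not PostgreSQL's actual code):
 * upper-case a multibyte string by converting it to wide characters,
 * applying towupper() to each one, and converting the result back.
 *
 * Assumes LC_CTYPE names a locale whose encoding matches the bytes in src.
 * Returns a malloc'd string that the caller must free, or NULL if the
 * input is not valid in the current locale or memory runs out.
 */
char *
upper_wide(const char *src)
{
    size_t   nwide;
    size_t   nbytes;
    size_t   i;
    wchar_t *wbuf;
    char    *dst;

    assert(src != NULL);

    /* Count the wide characters first; (size_t) -1 signals invalid input. */
    nwide = mbstowcs(NULL, src, 0);
    if (nwide == (size_t) -1)
        return NULL;

    wbuf = malloc((nwide + 1) * sizeof(wchar_t));
    if (wbuf == NULL)
        return NULL;
    mbstowcs(wbuf, src, nwide + 1);

    /* Case-map one wide character at a time. */
    for (i = 0; i < nwide; i++)
        wbuf[i] = (wchar_t) towupper((wint_t) wbuf[i]);

    /* The upper-cased text may need a different number of bytes. */
    nbytes = wcstombs(NULL, wbuf, 0);
    if (nbytes == (size_t) -1)
    {
        free(wbuf);
        return NULL;
    }

    dst = malloc(nbytes + 1);
    if (dst != NULL)
        wcstombs(dst, wbuf, nbytes + 1);

    free(wbuf);
    return dst;
}
```

A caller would typically run setlocale(LC_CTYPE, "") once at startup (declared in <locale.h>) so that the conversion functions follow the environment's encoding. In the default "C" locale, only ASCII input can be relied on to convert.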

regards, tom lane

In response to

Re: UTF-8 at 2006-10-13 16:58:27 from Martijn van Oosterhout
