From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tomi NA <hefest(at)gmail(dot)com>, Martins Mihailovs <martins(dot)mihailovs(at)europrojects(dot)org>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: UTF-8 |
Date: | 2006-10-13 16:58:27 |
Message-ID: | 20061013165827.GN1896@svana.org |
Lists: | pgsql-general |
On Fri, Oct 13, 2006 at 12:04:02PM -0400, Tom Lane wrote:
> "Tomi NA" <hefest(at)gmail(dot)com> writes:
> > 2006/10/13, Martijn van Oosterhout <kleptog(at)svana(dot)org>:
> >> Similarly, upper/lower are also supported, although postgresql doesn't
> >> take advantage of the system support in that case.
>
> > I think this is the crux of the problem.
>
> If it were true, then it might be ...
Eh? Here's the declaration of pg_toupper:
unsigned char pg_toupper(unsigned char ch);
Characters haven't fitted in an unsigned char in a very long time. It's
obviously bogus for any multibyte encoding (the code even says so). For
such encodings you could use the system's towupper() (ANSI C/Unix98)
which will work on any Unicode char.
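To illustrate the problem, here is a minimal sketch of byte-at-a-time upper-casing. The function name is made up for this example and it is not the PostgreSQL implementation; it only shows what happens when a routine with an unsigned char interface meets UTF-8 input in the C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: upper-case a string one
 * byte at a time, the way any "unsigned char in, unsigned char out"
 * interface has to.
 */
void
bytewise_strupper_sketch(char *s)
{
    size_t i;

    assert(s != NULL);

    for (i = 0; s[i] != '\0'; i++)
        s[i] = (char) toupper((unsigned char) s[i]);
}
```

Given the UTF-8 bytes for "café" ('c', 'a', 'f', 0xC3, 0xA9), the ASCII letters are converted but the two bytes of the accented letter are left alone in the C locale, since neither byte is a character on its own.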
To make this work, pg_strupper() will have to convert each character to
Unicode, run towupper() and convert back to the encoding. I imagine
that'll get rejected for being inefficient, but I really don't see any
other way.
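A rough sketch of that round trip, using only standard C library calls. The function name is hypothetical, and this assumes the string's encoding matches the current LC_CTYPE locale, so the libc's mbstowcs()/wcstombs() stand in for a real encoding conversion; whether wchar_t is actually Unicode depends on the platform.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not PostgreSQL code: upper-case a multibyte
 * string by converting to wide characters, applying towupper() to each
 * one, and converting back. Returns a malloc'd string that the caller
 * must free, or NULL on invalid input or out of memory.
 */
char *
mb_strupper_sketch(const char *src)
{
    size_t   len, nwide, bufsize, nbytes, i;
    wchar_t *wide;
    char    *dst;

    assert(src != NULL);

    /* A string never has more characters than bytes. */
    len = strlen(src);
    wide = malloc((len + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;

    nwide = mbstowcs(wide, src, len + 1);
    if (nwide == (size_t) -1)
    {
        free(wide);             /* invalid multibyte sequence */
        return NULL;
    }

    for (i = 0; i < nwide; i++)
        wide[i] = (wchar_t) towupper((wint_t) wide[i]);

    /* Upper-casing may change the byte length, so size for the worst case. */
    bufsize = nwide * MB_CUR_MAX + 1;
    dst = malloc(bufsize);
    if (dst == NULL)
    {
        free(wide);
        return NULL;
    }

    nbytes = wcstombs(dst, wide, bufsize);
    free(wide);
    if (nbytes == (size_t) -1)
    {
        free(dst);              /* character not representable */
        return NULL;
    }
    return dst;
}
```

For non-ASCII input the caller would first need something like setlocale(LC_CTYPE, "") with a suitable locale installed. The two conversions and the extra allocations per string are exactly the inefficiency mentioned above.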
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.