Re: lower and upper not UTF-8 safe

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
Cc: Julian Satchell <j(dot)satchell(at)eris(dot)qinetiq(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: lower and upper not UTF-8 safe
Date: 2003-08-05 13:45:50
Message-ID: 21850.1060091150@sss.pgh.pa.us
Lists: pgsql-hackers

Karel Zak <zakkr(at)zf(dot)jcu(dot)cz> writes:
> On Mon, Aug 04, 2003 at 05:03:02PM -0400, Tom Lane wrote:
>> Only if you use a locale that is assuming a character set that is not
>> UTF8 but does have characters with the high bit set. I'm not sure that
>> we can do anything to defend against locale/charset mismatch.

> We can try to detect the typical locale charset, compare it with the
> actual charset used in the DB, and send a NOTICE to the FE if they are
> mismatched. The problem is the portability of the charset detection
> code, because there are differences between OSes.

Yeah. If we had a portable, reliable way of testing for incompatibility,
I'd be in favor of just forbidding creation of databases that have
encoding choices incompatible with the server's LC_COLLATE/LC_CTYPE
settings. (If we ever allow those settings to be more dynamic than they
are, then the test would have to be made somewhere else, but for now it'd
be sufficient to put it in CREATE DATABASE.)

But I don't see a portable way to find out what charset a locale
supports. nl_langinfo() isn't in the C standard at all.
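
For illustration only, here is a rough sketch of what such a check might
look like on a platform that does provide the POSIX nl_langinfo(CODESET)
call. The function names charset_names_match() and
locale_matches_encoding() are hypothetical and are not taken from the
PostgreSQL sources. The loose name comparison is there on the assumption
that platforms spell charset names differently ("UTF-8" vs "utf8",
"ISO-8859-1" vs "ISO8859-1"); it does not handle true aliases.

```c
#define _POSIX_C_SOURCE 200809L  /* expose nl_langinfo() and strdup() */

#include <assert.h>
#include <ctype.h>
#include <langinfo.h>            /* POSIX, not ISO C */
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare two charset names, ignoring case and any non-alphanumeric
 * characters, so that "UTF-8" and "utf8" are treated as equal.
 * Returns 1 on match, 0 otherwise.
 */
int charset_names_match(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;) {
        while (*a != '\0' && !isalnum((unsigned char) *a))
            a++;
        while (*b != '\0' && !isalnum((unsigned char) *b))
            b++;
        if (*a == '\0' || *b == '\0')
            return *a == *b;    /* equal only if both names ended */
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
}

/*
 * Hypothetical check: does the charset of locale_name agree with
 * db_encoding?  Returns 1 if they match, 0 if they differ, and -1 if
 * the locale cannot be selected or reports no codeset.
 * LC_CTYPE is switched temporarily and restored before returning.
 */
int locale_matches_encoding(const char *locale_name, const char *db_encoding)
{
    const char *current;
    const char *codeset;
    char *saved;
    int result = -1;

    current = setlocale(LC_CTYPE, NULL);
    if (current == NULL)
        return -1;
    saved = strdup(current);    /* the returned string may be overwritten */
    if (saved == NULL)
        return -1;

    if (setlocale(LC_CTYPE, locale_name) != NULL) {
        codeset = nl_langinfo(CODESET);
        if (codeset != NULL && codeset[0] != '\0')
            result = charset_names_match(codeset, db_encoding);
    }

    setlocale(LC_CTYPE, saved); /* restore the previous setting */
    free(saved);
    return result;
}
```

A caller could treat 0 as grounds for rejecting the database and -1 as
"cannot tell". The -1 case is exactly the portability gap described
above: where nl_langinfo() is missing, this sketch does not even compile.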

regards, tom lane
