| From: | Marko Karppinen <marko(at)karppinen(dot)fi> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-hackers(at)postgreSQL(dot)org |
| Subject: | Re: Rough draft for Unicode-aware UPPER()/LOWER()/INITCAP() |
| Date: | 2004-05-16 18:56:16 |
| Message-ID: | B5E5BD42-A76A-11D8-9207-000A95C56374@karppinen.fi |
| Lists: | pgsql-hackers |
Tom Lane wrote:
> This code will only work if the database is running under an LC_CTYPE
> setting that implies the same encoding specified by server_encoding.
> However, I don't see that as a fatal objection, because in point of
> fact the existing upper/lower code assumes the same thing.
I think this interaction between the locale and server_encoding is
confusing. Is there any use case for running an incompatible mix?

If not, would it not make sense to fetch initdb's default database
encoding with nl_langinfo(CODESET) instead of using SQL_ASCII?
initdb could even emit a warning if the --encoding option was
used without also specifying --no-locale.

Using nl_langinfo(CODESET) was discussed and quietly dismissed a
year ago (although the topic was the client encoding back then).
But I think that the idea is worth revisiting because it would
allow UPPER() and LOWER() to work correctly with international
alphabets -- out of the box and without configuration -- on a
wide variety of modern systems.
mk
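
For illustration, the nl_langinfo(CODESET) lookup proposed above could look roughly like the sketch below. It is only a sketch under stated assumptions: the helper names locale_codeset() and encoding_without_no_locale() are hypothetical and are not taken from initdb, and the step that maps a platform codeset name onto a PostgreSQL encoding name is left out entirely.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical helper (not initdb code): report the codeset implied by
 * an LC_CTYPE locale name.  Pass "" to use the locale from the
 * environment.  Returns buf on success, or NULL if the locale cannot
 * be set.  The name is copied because nl_langinfo() may return a
 * pointer to storage that later calls overwrite.
 */
static const char *
locale_codeset(const char *ctype, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    if (setlocale(LC_CTYPE, ctype) == NULL)
        return NULL;

    snprintf(buf, buflen, "%s", nl_langinfo(CODESET));
    return buf;
}

/*
 * Hypothetical check for the suggested warning: true when an explicit
 * --encoding was given while locale support was left enabled.
 */
static int
encoding_without_no_locale(int encoding_given, int no_locale_given)
{
    return encoding_given && !no_locale_given;
}
```

Calling locale_codeset("", buf, sizeof buf) queries the locale inherited from the environment. The returned names are platform-specific (glibc, for example, reports "UTF-8" for en_US.UTF-8), so a real implementation would presumably need a table translating them into PostgreSQL encoding names and a fallback for names it does not recognize.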