Locales and Encodings

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Locales and Encodings
Date: 2007-10-12 11:24:34
Message-ID: 877ilsh8y5.fsf@oxford.xeocode.com
Lists: pgsql-hackers


It seems like the root of the problems we're butting our heads against with
encoding and locale is all the same issue: it's nonsensical to take the locale
at initdb time per-cluster and then allow user-specified encoding
per-database. If anything it would make more sense to go the other way around.

But actually it seems to me we could allow changing both on a per-database
basis with certain restrictions:

. template0 is always SQL_ASCII with locale C

. when creating a new database you can specify the encoding and locale and we
check that they're compatible.

. when creating a new database from a template the new locale and encoding
must be identical to the template database's encoding and locale, unless the
template is template0, in which case we rebuild all indexes after copying.
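
To make the three rules concrete, here is a rough sketch in Python. It is only an illustration, not backend code: the Database type, the check_create_database function, the codeset table, and the codeset-based compatibility test are all made up for the example, and skipping the rebuild when the requested settings match template0 is an assumption on top of the rules above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    """Hypothetical record of a database's encoding and locale."""
    name: str
    encoding: str
    locale: str


# Rule 1: template0 is always SQL_ASCII with locale C.
TEMPLATE0 = Database("template0", "SQL_ASCII", "C")

# Hypothetical table: normalized locale codeset -> encoding name.
CODESET_TO_ENCODING = {
    "UTF8": "UTF8",
    "ISO88591": "LATIN1",
    "EUCJP": "EUC_JP",
}


def compatible(encoding: str, locale: str) -> bool:
    """Rule 2: check that an encoding and a locale can be used together."""
    if locale in ("C", "POSIX"):
        return True  # byte-wise locales work with any encoding
    base = locale.split("@", 1)[0]  # drop modifiers such as @euro
    if "." not in base:
        return False  # no codeset given; this sketch refuses to guess
    codeset = base.split(".", 1)[1].upper().replace("-", "").replace("_", "")
    return CODESET_TO_ENCODING.get(codeset) == encoding


def check_create_database(template: Database, encoding: str, locale: str) -> bool:
    """Return True if indexes must be rebuilt after copying the template.

    Raises ValueError, with a message naming the problem, if the request
    violates one of the rules.
    """
    if not compatible(encoding, locale):
        raise ValueError(
            f"encoding {encoding} is not compatible with locale {locale}"
        )
    requested = (encoding, locale)
    if template.name == "template0":
        # Rule 3, exception: any compatible pair is allowed, at the cost
        # of an index rebuild. Skipping it when nothing changed is an
        # assumption of this sketch.
        return requested != (template.encoding, template.locale)
    if requested != (template.encoding, template.locale):
        # Rule 3: other templates must match exactly.
        raise ValueError(
            f"template {template.name} uses {template.encoding}/"
            f"{template.locale}; use template0 to change encoding or locale"
        )
    return False
```

Note that the error messages can say exactly which rule was broken and what to do about it, which is the point of doing the checks at create database time.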

We could liberalize this last restriction if we created a new encoding that is
like SQL_ASCII but enforces 7-bit ASCII. But then the index rebuild step
could take a long time.
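
As a sketch of what "enforces 7-bit ASCII" would mean, again in Python and purely illustrative (is_seven_bit is a hypothetical name, not an existing function): data that passes this check has the same byte representation in every encoding that is a superset of ASCII, which is why such a template could be copied into a database with a different encoding.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte has the high bit clear (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


# Pure ASCII text decodes to the same string in ASCII-superset encodings.
sample = b"hello"
same_everywhere = sample.decode("utf-8") == sample.decode("latin-1")

# A non-ASCII character fails the check once encoded.
accented = "\u00e9".encode("utf-8")  # e with acute accent, two bytes in UTF-8
```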

This would make the whole locale/encoding issue much more transparent. In
database listings you would see both listed side by side, you wouldn't be bound
by any initdb environment choices, and errors when running create database
would be able to tell you exactly what you're doing wrong and what you have to
do to avoid the problem.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
