Re: Proposal: CREATE CONVERSION

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: CREATE CONVERSION
Date: 2002-07-09 22:20:54
Message-ID: Pine.LNX.4.44.0207091858540.1247-100000@localhost.localdomain
Lists: pgsql-hackers

Thomas Lockhart writes:

> An aside: I was thinking about this some, from the PoV of using our
> existing type system to handle this (as you might remember, this is an
> inclination I've had for quite a while). I think that most things line
> up fairly well to allow this (and having transaction-enabled features
> may require it), but do notice that the SQL feature of allowing a
> different character set for every column *name* does not map
> particularly well to our underlying structures.

The more I think about it, the more I come to the conclusion that the
SQL framework for "character sets" is both bogus and a red herring. (And
it begins with figuring out exactly what a character set is, as opposed
to a form-of-use, a.k.a.(?) encoding, but let's ignore that.)

The ability to store each column value in a different encoding sounds
interesting, because it allows you to create tables such as

product_id | product_name_en | product_name_kr | product_name_jp

but you might as well create a table such as

product_id | lang | product_name

with product_name in Unicode, and have a more extensible application that
way, too.
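
As a rough illustration of the second layout, here is a minimal sketch using
Python's built-in sqlite3 module (which stores text as UTF-8 by default). The
table name product_names, the helper functions, and the sample rows are
assumptions made up for this example; only the three column names come from
the text above.

```python
import sqlite3


def build_catalog():
    """Create an in-memory table with one row per (product, language)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_names ("
        " product_id INTEGER NOT NULL,"
        " lang TEXT NOT NULL,"
        " product_name TEXT NOT NULL,"
        " PRIMARY KEY (product_id, lang))"
    )
    # Hypothetical sample data; every name is stored as Unicode text.
    rows = [
        (1, "en", "green tea"),
        (1, "kr", "녹차"),
        (1, "jp", "緑茶"),
    ]
    conn.executemany("INSERT INTO product_names VALUES (?, ?, ?)", rows)
    return conn


def product_name(conn, product_id, lang):
    """Return the name of a product in one language, or None if absent."""
    row = conn.execute(
        "SELECT product_name FROM product_names"
        " WHERE product_id = ? AND lang = ?",
        (product_id, lang),
    ).fetchone()
    return row[0] if row else None


conn = build_catalog()
# Supporting another language is one more row, not a new column.
conn.execute(
    "INSERT INTO product_names VALUES (?, ?, ?)", (1, "de", "grüner Tee")
)
```

With one column per language, the last step would instead have required
altering the table.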

I think it's fine to have the encoding fixed for the entire database. It
sure makes coding easier. If you want to be international, you use
Unicode. If not, you can "optimize" your database by using a more
efficient encoding. In fact, I think we should consider making UTF-8 the
default encoding sometime.
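
To give a feel for what "more efficient" can mean here, the following small
sketch compares the byte length of the same text in UTF-8 and in a
language-specific encoding, using Python's standard codecs. The helper name
encoded_sizes and the sample strings are assumptions chosen for illustration.

```python
def encoded_sizes(text, encodings):
    """Map each encoding name to the number of bytes needed for text."""
    return {enc: len(text.encode(enc)) for enc in encodings}


# Korean and Japanese text: 3 bytes per character in UTF-8,
# 2 bytes per character in the matching EUC encoding.
korean = encoded_sizes("한국어", ["utf-8", "euc_kr"])
japanese = encoded_sizes("日本語", ["utf-8", "euc_jp"])

# Plain ASCII text costs the same either way.
english = encoded_sizes("product", ["utf-8", "latin-1"])
```

The saving only applies to text that fits the narrower encoding; anything
outside its repertoire cannot be stored at all, which is the trade-off against
Unicode.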

The real issue is the collation. But the collation is a small subset of
the whole locale/character set gobbledygook. Standardized collation rules
in standardized forms exist. Finding/creating routines to interpret and
apply them should be the focus. SQL's notion of funneling the decision about
which collation rule to apply through the character set is bogus. It's
impossible to pick a default collation rule for many character sets
without applying bias.
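
The point about bias can be sketched with a toy example: the same strings, in
the same character set, sort differently under two language conventions. The
two key functions below are deliberately simplified assumptions (lowercase
input only, one special letter handled), not real collation implementations.
They mimic German dictionary ordering, where "ä" sorts like "a", and Swedish
ordering, where "ä" is a separate letter after "z".

```python
def german_dictionary_key(word):
    """Toy rule: treat 'ä' like 'a'; break ties with the original word."""
    return (word.replace("ä", "a"), word)


def swedish_key(word):
    """Toy rule: 'å', 'ä', 'ö' are separate letters after 'z'.

    Only handles lowercase letters listed in the alphabet below.
    """
    alphabet = "abcdefghijklmnopqrstuvwxyzåäö"
    return [alphabet.index(ch) for ch in word]


words = ["zebra", "ärmel", "arm"]

german_order = sorted(words, key=german_dictionary_key)
swedish_order = sorted(words, key=swedish_key)
```

Neither order is more correct for the character set as such; choosing one as
the default means favouring one language's convention.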

--
Peter Eisentraut peter_e(at)gmx(dot)net
