From: | "MauMau" <maumau307(at)gmail(dot)com> |
---|---|
To: | "Robert Haas" <robertmhaas(at)gmail(dot)com> |
Cc: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com>, "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: UTF8 national character data type support WIP patch and list of open issues. |
Date: | 2013-09-19 22:42:19 |
Message-ID: | 37B76474BB3149FD841373E12E355851@maumau |
Lists: | pgsql-hackers |
From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
> That may be what's important to you, but it's not what's important to
> me.
National character type support may be important to some potential users of
PostgreSQL, and therefore to PostgreSQL's popularity, rather than to me
personally. That's why national character support is listed on the PostgreSQL
TODO wiki. We might be losing potential users just because their selection
criteria include national character support.
> I am not keen to introduce support for nchar and nvarchar as
> differently-named types with identical semantics.
Similar examples already exist:
- varchar and text: the only difference is the existence of an explicit
length limit
- numeric and decimal
- int and int4, smallint and int2, bigint and int8
- real/double precision and float
In addition, the SQL standard itself admits:
"The <key word>s NATIONAL CHARACTER are used to specify the character type
with an implementation-defined character set. Special syntax (N'string') is
provided for representing literals in that character set.
...
"NATIONAL CHARACTER" is equivalent to the corresponding <character string
type> with a specification of "CHARACTER SET CSN", where "CSN" is an
implementation-defined <character set name>."
"A <national character string literal> is equivalent to a <character string
literal> with the "N" replaced by "<introducer><character set specification>",
where "<character set specification>" is an implementation-defined <character
set name>."
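To make the literal equivalence quoted above concrete, here is a minimal sketch in Python. It is not PostgreSQL code: the function name expand_national_literal is hypothetical, "CSN" is just the placeholder name used in the standard's text, and the check only looks at the outer shape of the literal (it does not validate quoting inside the string). It relies on the <introducer> being an underscore.

```python
def expand_national_literal(literal: str, csn: str = "CSN") -> str:
    """Rewrite N'...' as _<csn>'...' (hypothetical helper, illustration only)."""
    # A national character string literal is the letter N followed
    # directly by an ordinary quoted string.
    is_national = (
        len(literal) >= 3
        and literal[0] == "N"
        and literal[1] == "'"
        and literal.endswith("'")
    )
    if not is_national:
        raise ValueError(f"not a national character string literal: {literal!r}")
    # Replace the leading "N" with "<introducer><character set specification>";
    # the introducer is an underscore.
    return "_" + csn + literal[1:]


print(expand_national_literal("N'abc'"))          # _CSN'abc'
print(expand_national_literal("N'abc'", "UTF8"))  # _UTF8'abc'
```

In other words, once the implementation picks its character set name, N'abc' carries no information beyond an ordinary literal tagged with that character set.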
> And I think it's an
> even worse idea to introduce them now, making them work one way, and
> then later change the behavior in a backward-incompatible fashion.
I understand your concern. The incompatibility can be avoided by defining the
behavior as follows. How about this?
- NCHAR can be used with any database encoding.
- At first, NCHAR is exactly the same as CHAR. That is, the
"implementation-defined character set" described in the SQL standard is the
database character set.
- In the future, the character set for NCHAR can be selected at database
creation, like Oracle's CREATE DATABASE .... NATIONAL CHARACTER SET
AL16UTF16. The default is the database character set.
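The rule proposed in the list above can be sketched as follows. This is only an illustration in Python, not a design for the actual implementation: DatabaseSettings, national_character_set, and nchar_character_set are hypothetical names, and the encoding names are example values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatabaseSettings:
    """Hypothetical per-database settings, fixed at database creation."""
    encoding: str                                  # database character set
    national_character_set: Optional[str] = None   # possible future option


def nchar_character_set(db: DatabaseSettings) -> str:
    """Return the character set NCHAR would use under the proposal."""
    # First stage (and the default later on): NCHAR behaves exactly like
    # CHAR, so it uses the database character set.
    if db.national_character_set is None:
        return db.encoding
    # Future stage: a national character set chosen at database creation.
    return db.national_character_set


print(nchar_character_set(DatabaseSettings("EUC_JP")))          # EUC_JP
print(nchar_character_set(DatabaseSettings("EUC_JP", "UTF8")))  # UTF8
```

Because the default falls back to the database character set, a database created before the option exists keeps the same NCHAR behavior afterwards, which is the point about avoiding a backward-incompatible change.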
Could you tell me what kind of specification we should implement if we
officially support national character types?
Regards
MauMau