From: | Reinhard Max <max(at)suse(dot)de> |
---|---|
To: | Vsevolod Lobko <seva(at)sevasoft(dot)kiev(dot)ua> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal |
Date: | 2001-09-04 12:08:05 |
Message-ID: | Pine.LNX.4.33.0109041317540.8768-100000@wotan.suse.de |
Lists: | pgsql-patches |
Hi,
Sorry for stepping into this discussion late;
I've been on vacation for two weeks.
On Thu, 23 Aug 2001, Vsevolod Lobko wrote:
> > > Patch assumes that database encoding and system encoding of Tcl is
> > > equal.
> >
> > Hmm, is that a tenable assumption? I don't know, I'm just asking.
>
> Yes, because it does 8-bit to Unicode conversion and must know the
> codepage for 8-bit characters. Unfortunately the charset names for Tcl
> and Postgres do not match, so this demands an additional field in
> the charset tables or an additional table :((
I don't think you can assume that a database always has the same
encoding as Tcl's system encoding. For pl/tcl you could set the system
encoding to the database's encoding, but then you'd need that additional
name conversion table anyway, be it a database table or hardcoded. For
PgTcl it is definitely up to the user which system encoding the
interpreter has.
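A hardcoded name conversion table of the kind mentioned above could look roughly like the following sketch. The struct, the function name pg_to_tcl_encoding, and the selection of entries are my own illustration, not part of the patch; the table is deliberately incomplete and the name pairs should be checked against the encodings your PostgreSQL and Tcl builds actually provide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative pairs of PostgreSQL encoding names and the Tcl
 * encoding names they are assumed to correspond to. Incomplete. */
struct enc_name_map {
    const char *pg_name;
    const char *tcl_name;
};

static const struct enc_name_map enc_names[] = {
    { "SQL_ASCII", "ascii"     },
    { "UNICODE",   "utf-8"     },
    { "LATIN1",    "iso8859-1" },
    { "LATIN2",    "iso8859-2" },
    { "EUC_JP",    "euc-jp"    },
    { "KOI8",      "koi8-r"    },
    { "WIN",       "cp1251"    },
};

/* Returns the Tcl encoding name for a PostgreSQL encoding name,
 * or NULL if the name is not in the table. The comparison is
 * case-sensitive, which is a simplification. */
const char *pg_to_tcl_encoding(const char *pg_name)
{
    size_t i;

    assert(pg_name != NULL);
    for (i = 0; i < sizeof(enc_names) / sizeof(enc_names[0]); i++) {
        if (strcmp(enc_names[i].pg_name, pg_name) == 0)
            return enc_names[i].tcl_name;
    }
    return NULL;
}
```

A caller would look up the database's encoding name once per connection and fall back to doing no conversion (or raising an error) when the lookup returns NULL.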
For example, I create my databases in UNICODE (to get PostgreSQL
working with Tcl 8.3 without patching pl/tcl or PgTcl), but my
Tcl interpreter's system encoding is iso-8859-1.
So basically there are two possibilities:
a) Patch pl/tcl and PgTcl to do the code conversion, but do it right
   by using the database's encoding instead of Tcl's system encoding.
b) Require databases to be in UNICODE if they are to be accessed
   from Tcl >= 8.1, so that the strings that come out of the database
   are already UTF-8.
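To illustrate why a) depends on knowing the database's encoding: the same byte means different characters in different 8-bit codepages, so the conversion to UTF-8 has to be chosen per encoding. The sketch below handles only the single case of a LATIN1 database and is written by hand to stay self-contained; the function name latin1_to_utf8 is hypothetical, and real code in pl/tcl or PgTcl would presumably go through Tcl's own conversion routines (such as Tcl_ExternalToUtf) with the encoding that matches the database.

```c
#include <assert.h>
#include <stddef.h>

/* Converts len bytes of ISO-8859-1 (LATIN1) text in src to UTF-8.
 * dst must have room for at least 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *src, size_t len,
                      unsigned char *dst)
{
    size_t i, out = 0;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];

        if (c < 0x80) {
            /* 7-bit ASCII is identical in LATIN1 and UTF-8 */
            dst[out++] = c;
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

Running the same bytes through this function when the database is really in, say, KOI8 would produce valid UTF-8 for the wrong characters, which is the failure mode of assuming Tcl's system encoding.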
For b) it would be nice to have a per-database attribute that
specifies the default client encoding used for clients that
don't explicitly set an encoding. I am thinking of something like:
  $ createdb --encoding UNICODE --default-client-encoding LATIN1 foo
This database could be used from Tcl without any code conversion, but
would look like it was in LATIN1 to other clients (e.g. psql) if they
don't explicitly set an encoding.
I'd vote for b), because I think there is a general movement towards
Unicode anyway.
cu
Reinhard