From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, dpage(at)vale-housing(dot)co(dot)uk, oliver(at)opencloud(dot)com, zakkr(at)zf(dot)jcu(dot)cz, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: UTF8 or Unicode |
Date: | 2005-02-25 04:51:16 |
Message-ID: | 200502250451.j1P4pHi06087@candle.pha.pa.us |
Lists: | pgsql-hackers pgsql-patches |
Tatsuo Ishii wrote:
> I do not object to changing UNICODE->UTF-8, but all these discussions
> sound a little bit funny to me.
>
> If you want to blame UNICODE, you should blame LATIN1 etc. as
> well. LATIN1(ISO-8859-1) is actually a character set name, not an
> encoding name. ISO-8859-1 can be encoded in 8-bit single byte
> stream. But it can be encoded in 7-bit too. So when we refer to
> LATIN1(ISO-8859-1), it's not clear if it's encoded in 7/8-bit.

Wow, Tatsuo has a point here. Looking at encnames.c, I see:

    "UNICODE", PG_UTF8

but also:

    "WIN", PG_WIN1251
    "LATIN1", PG_LATIN1

and I see conversions for those:

    "iso88591", PG_LATIN1
    "win", PG_WIN1251

so I see what he is saying. We are not consistent in favoring the
official names vs. the common names.

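To make the inconsistency concrete, here is a minimal sketch of an alias table in which several spellings resolve to one encoding ID. It is not the actual encnames.c code: the `enc_id` values, `alias_tbl`, `normalize_name` and `lookup_encoding` are hypothetical names chosen for illustration, and the "utf8" and "win1251" entries are assumed aliases that do not appear in the excerpts above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical encoding IDs, standing in for PG_UTF8, PG_LATIN1, PG_WIN1251. */
typedef enum { ENC_UNKNOWN = -1, ENC_UTF8, ENC_LATIN1, ENC_WIN1251 } enc_id;

typedef struct {
    const char *name;   /* normalized form: lowercase, alphanumerics only */
    enc_id      id;
} enc_alias;

/* Illustrative table mixing common names and official names. */
static const enc_alias alias_tbl[] = {
    { "iso88591", ENC_LATIN1 },   /* official name, normalized */
    { "latin1",   ENC_LATIN1 },   /* common name */
    { "unicode",  ENC_UTF8 },     /* common name */
    { "utf8",     ENC_UTF8 },     /* assumed alias for the official name */
    { "win",      ENC_WIN1251 },  /* common name */
    { "win1251",  ENC_WIN1251 },  /* assumed alias */
};

/* Lowercase the input and drop non-alphanumerics: "ISO-8859-1" -> "iso88591". */
static void normalize_name(const char *in, char *out, size_t outlen)
{
    size_t n = 0;

    if (outlen == 0)
        return;
    for (; *in != '\0' && n + 1 < outlen; in++) {
        if (isalnum((unsigned char) *in))
            out[n++] = (char) tolower((unsigned char) *in);
    }
    out[n] = '\0';
}

/* Resolve any known spelling to its encoding ID, or ENC_UNKNOWN. */
enc_id lookup_encoding(const char *name)
{
    char   key[64];
    size_t i;

    normalize_name(name, key, sizeof key);
    for (i = 0; i < sizeof alias_tbl / sizeof alias_tbl[0]; i++) {
        if (strcmp(key, alias_tbl[i].name) == 0)
            return alias_tbl[i].id;
    }
    return ENC_UNKNOWN;
}
```

With a table like this, `lookup_encoding("ISO-8859-1")` and `lookup_encoding("LATIN1")` return the same ID, so the open question is only which spelling gets reported back as the canonical one.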
I will work on a patch that people can review and test.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073