From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Abhijit Menon-Sen <ams(at)oryx(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: UTF8 or Unicode |
Date: | 2005-02-15 03:05:08 |
Message-ID: | 200502150305.j1F358K20470@candle.pha.pa.us |
Lists: | pgsql-hackers |
Abhijit Menon-Sen wrote:
> At 2005-02-14 21:14:54 -0500, pgman(at)candle(dot)pha(dot)pa(dot)us wrote:
> >
> > Should our multi-byte encoding be referred to as UTF8 or Unicode?
>
> The *encoding* should certainly be referred to as UTF-8. Unicode is a
> character set, not an encoding; Unicode characters may be encoded with
> UTF-8, among other things.
>
> (One might think of a charset as being a set of integers representing
> characters, and an encoding as specifying how those integers may be
> converted to bytes.)
>
> > I know UTF8 is a type of unicode but do we need to rename anything
> > from Unicode to UTF8?
>
> I don't know. I'll go through the documentation to see if I can find
> anything that needs changing.

I looked at encoding.sgml, which mentions Unicode first and then gives UTF8 as
an acronym. I am wondering whether we need to put UTF8 first and Unicode
second. Does initdb accept UTF8 as an encoding?
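
The charset-versus-encoding distinction quoted above can be sketched in a few lines of Python. This is only an illustration, not anything from PostgreSQL itself: the helper name `describe` is hypothetical, and UTF-16 is included merely as a second example of an encoding for the same Unicode code points.

```python
def describe(ch):
    """Return (code point, UTF-8 bytes, UTF-16 big-endian bytes) for one character.

    The code point is the integer that the Unicode character set assigns
    to the character; the byte strings are two different encodings of
    that same integer.
    """
    return (ord(ch), ch.encode("utf-8"), ch.encode("utf-16-be"))


if __name__ == "__main__":
    for ch in "é€":
        code_point, utf8, utf16 = describe(ch)
        # Same code point in both cases, but different byte sequences.
        print(f"U+{code_point:04X}  utf-8={utf8.hex()}  utf-16-be={utf16.hex()}")
```

The code point stays the same whichever encoding is chosen; only the bytes differ, which is why "Unicode" names the character set and "UTF-8" names the encoding.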
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073