Re: Status report: regex replacement

From:	Tim Allen <tim(at)proximity(dot)com(dot)au>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Status report: regex replacement
Date:	2003-02-07 02:18:44
Message-ID:	200302071318.44043.tim@proximity.com.au
Lists:	pgsql-hackers

On Fri, 7 Feb 2003 00:49, Hannu Krosing wrote:
> Tatsuo Ishii wrote on Thu, 06.02.2003 at 17:05:
> > > Perhaps we should not call the encoding UNICODE but UTF8 (which it
> > > really is). UNICODE is a character set which has half a dozen official
> > > encodings and calling one of them "UNICODE" does not make things very
> > > clear.
> >
> > Right. Also, perhaps we should name LATIN1 or ISO-8859-1 more
> > precisely, since ISO-8859-1 can be encoded in either 7 bits or 8 bits
> > (we use the latter). I don't know what it is called, though.
>
> I don't think that calling 8-bit ISO-8859-1 ISO-8859-1 can confuse
> anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used.
>
> UTF-8 seems to be the most popular, but even the XML standard requires all
> compliant implementations to handle at least both UTF-8 and UTF-16.

Strong agreement from me, for whatever value you wish to place on my opinion.
UTF-8 is a preferable name to UNICODE. The case for distinguishing 7-bit from
8-bit latin1 seems much weaker.
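
To illustrate why "UNICODE" is ambiguous as an encoding name, here is a
minimal Python sketch. It assumes the sample character U+00E9; the variable
names are purely illustrative and are not PostgreSQL encoding names. The same
character comes out as different bytes under UTF-8, UTF-16 and Latin-1, and
it cannot be represented in 7-bit ASCII at all.

```python
# Illustrative sketch only: one Unicode character, several byte encodings.
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8 = text.encode("utf-8")        # two bytes: c3 a9
utf16 = text.encode("utf-16-be")   # two bytes, big-endian, no BOM: 00 e9
latin1 = text.encode("latin-1")    # one byte with the high bit set: e9

for name, data in (("utf-8", utf8), ("utf-16-be", utf16), ("latin-1", latin1)):
    print(name, data.hex())

# A strictly 7-bit encoding has no representation for this character.
try:
    text.encode("ascii")
    fits_in_7_bits = True
except UnicodeEncodeError:
    fits_in_7_bits = False
print("fits in 7 bits:", fits_in_7_bits)
```

All three byte strings decode back to the same character, so the character
set is the same in each case; only the encoding differs.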

Tim

--
-----------------------------------------------
Tim Allen tim(at)proximity(dot)com(dot)au
Proximity Pty Ltd http://www.proximity.com.au/
http://www4.tpg.com.au/users/rita_tim/

In response to

Re: Status report: regex replacement at 2003-02-06 13:49:35 from Hannu Krosing