Re: UTF8 national character data type support WIP patch and list of open issues.

From:	"MauMau" <maumau307(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc:	"Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com>, "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: UTF8 national character data type support WIP patch and list of open issues.
Date:	2013-09-18 22:46:37
Message-ID:	1191A5384BD641C68D288AF210BEFDA8@maumau
Lists:	pgsql-hackers

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Another point to keep in mind is that UTF16 is not really any easier
> to deal with than UTF8, unless you write code that fails to support
> characters outside the basic multilingual plane. Which is a restriction
> I don't believe we'd accept. But without that restriction, you're still
> forced to deal with variable-width characters; and there's nothing very
> nice about the way that's done in UTF16. So on the whole I think it
> makes more sense to use UTF8 for this.

I agree. I guess the reason Windows, Java, and Oracle chose UTF-16 is that it
was still UCS-2, covering only the BMP, when they chose it. So character
handling was easier and faster thanks to the fixed-width encoding.
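To illustrate the point about variable width, here is a minimal Python sketch (not part of the patch under discussion; the helper names `utf16_code_units`, `utf8_bytes`, and `surrogate_pair` are made up for this example). It shows that a character inside the BMP takes one 16-bit code unit in UTF-16, while a character outside the BMP takes two (a surrogate pair), so UTF-16 is only fixed-width if non-BMP characters are ignored.

```python
def utf16_code_units(ch: str) -> int:
    """Count the 16-bit code units needed to encode one character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def utf8_bytes(ch: str) -> int:
    """Count the bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


if __name__ == "__main__":
    # U+0041 and U+3042 are in the BMP; U+20BB7 is outside it.
    for ch in ("A", "\u3042", "\U00020BB7"):
        print(f"U+{ord(ch):04X}: "
              f"{utf16_code_units(ch)} UTF-16 code unit(s), "
              f"{utf8_bytes(ch)} UTF-8 byte(s)")
    high, low = surrogate_pair(0x20BB7)
    print(f"U+20BB7 as surrogates: {high:04X} {low:04X}")
```

Code that assumes one code unit per character (the UCS-2 view) would treat U+20BB7 as two separate characters, which is the kind of restriction mentioned in the quoted text.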

Regards
MauMau

In response to

Re: UTF8 national character data type support WIP patch and list of open issues. at 2013-09-18 17:18:00 from Tom Lane
