Re: UTF16 surrogate pairs in UTF8 encoding

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Marko Kreen <markokr(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: UTF16 surrogate pairs in UTF8 encoding
Date:	2010-09-08 10:35:18
Message-ID:	1283942118.18999.1.camel@fsopti579.F-Secure.com
Lists:	pgsql-hackers

On Wed, 2010-09-08 at 10:18 +0300, Marko Kreen wrote:
> On 9/7/10, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> > On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
> > > > We combine the surrogate pair components to a single code point and
> > > > encode that in UTF-8. We don't encode the components separately;
> > > > that would be wrong.
> > >
> > > Oh, OK. Should the docs make that a bit clearer?
> >
> >
> > Done.
>
> This is confusing:
>
> (When surrogate
> pairs are used when the server encoding is <literal>UTF8</>, they
> are first combined into a single code point that is then encoded
> in UTF-8.)
>
> So something else happens if encoding is not UTF8?

Then you can't specify surrogate pairs because they are outside of the
ASCII range, per the constraint mentioned earlier in the paragraph.
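
For readers who want to see the combining step in the quoted doc text
spelled out, here is a minimal sketch in Python. It is an illustration
only, not the server's code; the function names combine_surrogates and
encode_pair_utf8 are made up for this example. It assumes the standard
UTF-16 rule: the high surrogate (D800..DBFF) supplies the upper 10 bits
and the low surrogate (DC00..DFFF) the lower 10 bits of an offset from
U+10000.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single Unicode code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("first value is not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("second value is not a low surrogate")
    # Each surrogate carries 10 bits of the offset above U+10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_pair_utf8(high: int, low: int) -> bytes:
    """Encode the combined code point (not the two halves) in UTF-8."""
    return chr(combine_surrogates(high, low)).encode("utf-8")
```

With the pair D83D DE00, the combined code point is U+1F600 and the
UTF-8 result is the four-byte sequence F0 9F 98 80, rather than two
separately encoded three-byte sequences.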

> I think this part can be simply removed, it does not add anything.
>
> Or say that surrogate pairs are only allowed in UTF8 encoding.
> The reason is that you cannot encode 0..7F code points with them,
> and only those are allowed to be given numerically. But this is
> already mentioned before.

Well, Tom wanted an additional explanation. I personally agree with
you; this is not the place to explain encoding and Unicode internals,
when really the code only does what it's supposed to.

In response to

Re: UTF16 surrogate pairs in UTF8 encoding at 2010-09-08 07:18:36 from Marko Kreen

Responses

Re: UTF16 surrogate pairs in UTF8 encoding at 2010-09-08 10:45:37 from Marko Kreen
