From: | Marko Kreen <markokr(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: UTF16 surrogate pairs in UTF8 encoding |
Date: | 2010-09-08 07:18:36 |
Message-ID: | AANLkTikiWsunoVFqb0mceH59LvSQf1vt7-QCZeJL5ZGY@mail.gmail.com |
Lists: | pgsql-hackers |
On 9/7/10, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
> > > We combine the surrogate pair components to a single code point and
> > > encode that in UTF-8. We don't encode the components separately; that
> > > would be wrong.
> >
> > Oh, OK. Should the docs make that a bit clearer?
>
>
> Done.
This is confusing:
(When surrogate
pairs are used when the server encoding is <literal>UTF8</>, they
are first combined into a single code point that is then encoded
in UTF-8.)
So something else happens if encoding is not UTF8?
I think this part can simply be removed; it does not add anything.
Or say that surrogate pairs are only allowed with the UTF8 encoding.
The reason is that you cannot encode code points 0..7F with them,
and only those are allowed to be given numerically in other encodings.
But this is already mentioned earlier.
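
To make the point concrete, here is a minimal sketch of what "combined into a single code point that is then encoded in UTF-8" means. This is plain Python for illustration only, not the server's code; the function names combine_surrogates and encode_utf8 are made up for this example. Note that the smallest code point a surrogate pair can produce is U+10000, so a pair can never yield anything in 0..7F.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single code point.

    Hypothetical helper for illustration; not a PostgreSQL function.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: U+%04X" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: U+%04X" % low)
    # Each half carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_utf8(cp: int) -> bytes:
    """Encode one code point as UTF-8 (hand-written to show the bit layout)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point: %X" % cp)
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# The pair D83D DE00 is combined first, then encoded once as four bytes.
cp = combine_surrogates(0xD83D, 0xDE00)
utf8 = encode_utf8(cp)
print("U+%X" % cp, utf8.hex())

# The lowest possible pair still lands above the ASCII range.
lowest = combine_surrogates(0xD800, 0xDC00)
```

Encoding each half separately would instead give two three-byte sequences, which is not valid UTF-8, and is the case Peter calls wrong above.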
--
marko