From: | Florian Weimer <fweimer(at)bfk(dot)de> |
---|---|
To: | Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | Alexander Shulgin <ash(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Making TEXT NUL-transparent |
Date: | 2011-12-23 10:52:13 |
Message-ID: | 82obuza7c2.fsf@mid.bfk.de |
Lists: | pgsql-hackers |
* Florian Pflug:
> On Nov 24, 2011, at 10:54, Florian Weimer wrote:
>>> Or is it not only about being able to *store* NULs in a text field?
>>
>> No, the entire core should be NUL-transparent.
>
> That's unlikely to happen.
Yes, with the type input/output functions tied to NUL-terminated
strings, that seems indeed unlikely to happen.
> A more realistic approach would be to solve this only for UTF-8
> encoded strings by encoding the NUL character not as a single 0 byte,
> but as a sequence of non-0 bytes.
The byte 0xFF cannot occur in valid UTF-8, so using it to stand in for
NUL is one possibility.
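
To make the 0xFF idea concrete, here is a minimal sketch, assuming the escape is applied as a plain byte substitution on top of ordinary UTF-8. The function names are hypothetical and nothing like this exists in PostgreSQL. It relies on two properties of UTF-8: a 0x00 byte appears only as the encoding of U+0000, and 0xFF never appears at all.

```python
def encode_nul_as_ff(text: str) -> bytes:
    """Encode text as UTF-8, then replace every NUL byte with 0xFF.

    Hypothetical helper: in UTF-8 a 0x00 byte can only be the encoding
    of U+0000, so a bytewise replace touches nothing else.
    """
    return text.encode("utf-8").replace(b"\x00", b"\xff")


def decode_nul_from_ff(data: bytes) -> str:
    """Reverse encode_nul_as_ff; reject input that still contains NUL."""
    if b"\x00" in data:
        raise ValueError("NUL byte in supposedly NUL-free data")
    # 0xFF never occurs in valid UTF-8, so every 0xFF here is an escape.
    return data.replace(b"\xff", b"\x00").decode("utf-8")


stored = encode_nul_as_ff("key\x00value")
assert b"\x00" not in stored
assert decode_nul_from_ff(stored) == "key\x00value"
```

The stored form is safe to pass through NUL-terminated C strings, but it is no longer valid UTF-8, so anything that validates the encoding would have to know about the escape.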
> Java, for example, seems to use it to serialize Strings (which may contain
> NUL characters) to UTF-8.
Only internally in the VM. The UTF-8 I/O encoders and decoders produce
and consume plain NUL bytes.
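
For comparison, a rough sketch of the two-byte form that Java's modified UTF-8 uses for U+0000, the overlong sequence 0xC0 0x80. The helper names are hypothetical, and the sketch covers only the NUL handling, not the rest of modified UTF-8. It also shows why the form does not travel well: a strict UTF-8 decoder, such as Python's, rejects it.

```python
def encode_nul_overlong(text: str) -> bytes:
    """UTF-8 with U+0000 written as the overlong pair 0xC0 0x80."""
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_overlong(data: bytes) -> str:
    """Reverse encode_nul_overlong.

    0xC0 is never a legal byte in standard UTF-8, so the pair
    0xC0 0x80 cannot be confused with any other character.
    """
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


def strict_utf8_accepts(data: bytes) -> bool:
    """Report whether a standard-conforming decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


internal = encode_nul_overlong("a\x00b")
assert decode_nul_overlong(internal) == "a\x00b"
assert not strict_utf8_accepts(internal)  # overlong form is invalid UTF-8
assert strict_utf8_accepts(b"a\x00b")     # the plain NUL byte is valid
```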
> Should you try to add a new encoding which supports that, you might also
> want to allow CESU-8-style encoding of UTF-16 surrogate pairs. This means
> that code points representable by UTF-16 surrogate pairs may be encoded by
> separately encoding the two surrogate characters in UTF-8.
I'm not sure if this is a good idea. The motivation behind CESU-8 is
that it sorts byte-encoded strings in the same order as UTF-16, which is
a completely separate concern.
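
The sort-order point can be checked with a small sketch. The `cesu8_encode` function below is a hypothetical helper written for this illustration, not a library API: it splits each supplementary code point into its UTF-16 surrogate pair and encodes each surrogate as a three-byte sequence.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper for illustration)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split into a UTF-16 surrogate pair, then encode each
            # surrogate separately as a 3-byte sequence.
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


bmp = "\uff5e"          # BMP character above the surrogate range
supp = "\U00010000"     # first supplementary code point

# UTF-16 code unit order puts the surrogate pair first ...
assert supp.encode("utf-16-be") < bmp.encode("utf-16-be")
# ... and CESU-8 byte order agrees with it,
assert cesu8_encode(supp) < cesu8_encode(bmp)
# while plain UTF-8 byte order follows code point order instead.
assert bmp.encode("utf-8") < supp.encode("utf-8")
```

So bytewise comparison of CESU-8 reproduces UTF-16 ordering, which only matters if that ordering is what you want to preserve.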
--
Florian Weimer <fweimer(at)bfk(dot)de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99