From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: jsonb, unicode escapes and escaped backslashes |
Date: | 2015-01-28 17:18:10 |
Message-ID: | 54C919D2.6040700@dunslane.net |
Lists: | pgsql-hackers |
On 01/28/2015 12:50 AM, Noah Misch wrote:
> On Tue, Jan 27, 2015 at 03:56:22PM -0500, Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> On 01/27/2015 02:28 PM, Tom Lane wrote:
>>>> Well, we can either fix it now or suffer with a broken representation
>>>> forever. I'm not wedded to the exact solution I described, but I think
>>>> we'll regret it if we don't change the representation.
>> So at this point I propose that we reject \u0000 when de-escaping JSON.
> I would have agreed on 2014-12-09, and this release is the last chance to make
> such a change. It is a bold wager that could pay off, but -1 from me anyway.
> I can already envision the blog post from the DBA staying on 9.4.0 because
> 9.4.1 pulled his ability to store U+0000 in jsonb. jsonb was *the* top-billed
> 9.4 feature, and this thread started with Andrew conveying a field report of a
> scenario more obscure than storing U+0000. Therefore, we have to assume many
> users will notice the change. This move would also add to the growing
> evidence that our .0 releases are really beta(N+1) releases in disguise.
>
>> Anybody who's seriously unhappy with that can propose a patch to fix it
>> properly in 9.5 or later.
> Someone can still do that by introducing a V2 of the jsonb binary format and
> preserving the ability to read both formats. (Too bad Andres's proposal to
> include a format version didn't inform the final format, but we can wing it.)
> I agree that storing U+0000 as 0x00 is the best end state.
>
>
We need to make up our minds about this pretty quickly. The more radical
move is likely to involve quite a bit of work, ISTM.
It's not clear to me how we should represent a Unicode null, i.e. given
a json of '["foo\u0000bar"]', I get that we'd store the element as
'foo\x00bar', but what is the result of
(jsonb '["foo\u0000bar"]')->>0
It's defined to be text so we can't just shove a binary null in the
middle of it. Do we throw an error?
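
To make the question concrete, here is a minimal Python sketch of the "throw an error" option. It is not PostgreSQL code: element_as_text and NulInTextError are hypothetical names standing in for the ->> operator and whatever error it might raise, and Python's json module stands in for the de-escaping step.

```python
import json


class NulInTextError(ValueError):
    """Hypothetical error: the value cannot be returned as NUL-free text."""


def element_as_text(json_doc: str, index: int) -> str:
    """Hypothetical stand-in for (jsonb '...')->>index."""
    # json.loads de-escapes \u0000 into a real U+0000 character,
    # which mirrors storing the element as 'foo\x00bar'.
    value = json.loads(json_doc)[index]
    if isinstance(value, str):
        # A text result cannot carry an embedded NUL, so one option is to fail here.
        if "\x00" in value:
            raise NulInTextError("U+0000 cannot be represented in a text value")
        return value
    # Non-string elements are rendered back as JSON text.
    return json.dumps(value)
```

With this sketch, storing the document succeeds and only the extraction of the offending element fails; rejecting \u0000 when de-escaping, as proposed upthread, would move the same check to input time instead.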
And I still want to hear more voices on the whole direction we want to
take this.
cheers
andrew