| From: | Robert Haas <robertmhaas(at)gmail(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Stephen Frost <sfrost(at)snowman(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: jsonb, unicode escapes and escaped backslashes | 
| Date: | 2015-02-02 13:01:55 | 
| Message-ID: | CA+TgmobgwqTezL3omnFfX7CdMSvatbvM3gFXG-8h16zpt++trw@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On Sat, Jan 31, 2015 at 8:25 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I understand Andrew to be saying that if you take a 6-character string
>> and convert it to a JSON string and then back to text, you will
>> *usually* get back the same 6 characters you started with ... unless
>> the first character was \, the second u, and the remainder hexadecimal
>> digits.  Then you'll get back a one-character string or an error
>> instead.  It's not hard to imagine that leading to surprising
>> behavior, or even security vulnerabilities in applications that aren't
>> expecting such a translation to happen under them.
>
> That *was* the case, with the now-reverted patch that changed the escaping
> rules.  It's not anymore:
>
> regression=# select to_json('\u1234'::text);
>   to_json
> -----------
>  "\\u1234"
> (1 row)
>
> When you convert that back to text, you'll get \u1234, no more and no
> less.  For example:
>
> regression=# select array_to_json(array['\u1234'::text]);
>  array_to_json
> ---------------
>  ["\\u1234"]
> (1 row)
>
> regression=# select array_to_json(array['\u1234'::text])->0;
>  ?column?
> -----------
>  "\\u1234"
> (1 row)
>
> regression=# select array_to_json(array['\u1234'::text])->>0;
>  ?column?
> ----------
>  \u1234
> (1 row)
>
> Now, if you put in '"\u1234"'::jsonb and extract that string as text,
> you get some Unicode character or other.  But I'd say that a JSON user
> who is surprised by that doesn't understand JSON, and definitely that they
> hadn't read more than about one paragraph of our description of the JSON
> types.
Totally agree.  That's why I think reverting the patch was the right
thing to do.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
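
The round trip discussed above can be sketched outside the database. This is a minimal illustration using Python's standard `json` module, on the assumption that its string escaping matches the behavior shown in the quoted psql session; it is not PostgreSQL's own code path, and the variable names are hypothetical.

```python
import json

# A 6-character text value: backslash, 'u', then four hex digits.
text_value = "\\u1234"

# Encoding the text as a JSON string escapes the backslash,
# analogous to to_json('\u1234'::text) in the quoted session.
encoded = json.dumps(text_value)

# Decoding the JSON string gives back the same 6 characters,
# analogous to extracting the element with ->>.
decoded = json.loads(encoded)

# By contrast, JSON input that itself contains the \u1234 escape
# decodes to a single Unicode character, analogous to '"\u1234"'::jsonb.
json_literal = '"\\u1234"'
from_escape = json.loads(json_literal)

print(encoded)           # the JSON form, with the backslash doubled
print(len(decoded))      # length of the round-tripped text
print(len(from_escape))  # length when the escape is interpreted
```

The first case starts from text and so must preserve it exactly; the second starts from JSON, where `\u1234` is by definition an escape for one character.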