From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: json/jsonb inconsistence |
Date: | 2014-05-29 12:12:16 |
Message-ID: | 53872420.2000207@dunslane.net |
Lists: | pgsql-hackers |
On 05/29/2014 07:55 AM, Teodor Sigaev wrote:
> # select '"\uaBcD"'::json;
> json
> ----------
> "\uaBcD"
>
> but
>
> # select '"\uaBcD"'::jsonb;
> ERROR: invalid input syntax for type json
> LINE 1: select '"\uaBcD"'::jsonb;
> ^
> DETAIL: Unicode escape values cannot be used for code point values
> above 007F when the server encoding is not UTF8.
> CONTEXT: JSON data, line 1: ...
>
> and
>
> # select '"\uaBcD"'::json -> 0;
> ERROR: invalid input syntax for type json
> DETAIL: Unicode escape values cannot be used for code point values
> above 007F when the server encoding is not UTF8.
> CONTEXT: JSON data, line 1: ...
> Time: 0,696 ms
>
> Moreover, json looks strange:
>
> # select '["\uaBcD"]'::json;
> json
> ------------
> ["\uaBcD"]
>
> but
>
> # select '["\uaBcD"]'::json->0;
> ERROR: invalid input syntax for type json
> DETAIL: Unicode escape values cannot be used for code point values
> above 007F when the server encoding is not UTF8.
> CONTEXT: JSON data, line 1: [...
>
> Looks like random parse rules.
>
It is documented that for json we don't check the validity of Unicode
escapes until we try to use them. That's because the original json
parser didn't check them, and if we started doing so now, users who
pg_upgraded would find themselves with invalid data in the database. The
rules for jsonb are stricter, because we actually resolve the Unicode
escapes when constructing the jsonb object. There is nothing at all
random about it, although I agree it's slightly inconsistent. It's been
the subject of some discussion on -hackers previously, IIRC. I actually
referred to this difference in my talk at PGCon last Friday.

Frankly, if you want to use json/jsonb, you are going to be best served
by using a UTF8-encoded database, or by not using non-ASCII Unicode
escapes in json at all.
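
To make the difference described above concrete, here is a minimal toy model in Python. It is a sketch only and not PostgreSQL's parser: the function names (`json_in`, `json_use`, `jsonb_in`, `resolve_escapes`) are hypothetical, the "server encoding" is just a string argument, and surrogate pairs and escaped backslashes are ignored for brevity. It only illustrates when the escape is resolved in each case.

```python
import re

# Toy model only: this is NOT PostgreSQL's parser. All names are hypothetical.
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def resolve_escapes(body: str, server_encoding: str) -> str:
    """Turn \\uXXXX escapes into characters, applying the encoding rule."""
    def repl(match):
        code_point = int(match.group(1), 16)
        # Same rule as the DETAIL message quoted in this thread.
        if code_point > 0x7F and server_encoding != "UTF8":
            raise ValueError(
                "Unicode escape values cannot be used for code point values "
                "above 007F when the server encoding is not UTF8."
            )
        return chr(code_point)
    return ESCAPE.sub(repl, body)


def json_in(body: str, server_encoding: str) -> str:
    """json-like input: the text is kept verbatim, escapes are not resolved."""
    return body


def json_use(stored: str, server_encoding: str) -> str:
    """json-like access: escapes are resolved only now, so errors surface late."""
    return resolve_escapes(stored, server_encoding)


def jsonb_in(body: str, server_encoding: str) -> str:
    """jsonb-like input: escapes are resolved while the value is built."""
    return resolve_escapes(body, server_encoding)
```

In this model, `json_in(r"\uaBcD", "LATIN1")` succeeds and returns the text unchanged, while `json_use` on that stored value and `jsonb_in(r"\uaBcD", "LATIN1")` both raise the error. With `"UTF8"` as the encoding, or with an escape at or below 007F, all three calls succeed.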
cheers
andrew