From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Joey Adams <joeyadams3(dot)14159(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Claes Jakobsson <claes(at)surfar(dot)nu>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Jan Urbański <wulczer(at)wulczer(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Jan Wieck <janwieck(at)yahoo(dot)com> |
Subject: | Re: JSON for PG 9.2 |
Date: | 2012-01-20 05:07:06 |
Message-ID: | 21994.1327036026@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 01/19/2012 04:12 PM, Robert Haas wrote:
>> On Thu, Jan 19, 2012 at 4:07 PM, Andrew Dunstan<andrew(at)dunslane(dot)net> wrote:
>>> The spec only allows unescaped Unicode chars (and for our purposes that
>>> means UTF8). An unescaped non-ASCII character in, say, ISO-8859-1 will
>>> result in something that's not legal JSON.
>> I understand. I'm proposing that we not care. In other words, if the
>> server encoding is UTF-8, it'll really be JSON. But if the server
>> encoding is something else, it'll be almost-JSON.
> Of course, for data going to the client, if the client encoding is UTF8,
> they should get legal JSON, regardless of what the database encoding is,
> and conversely too, no?
Yes. I think this argument has been mostly theologizing, along the
lines of how many JSON characters can dance on the head of a pin.
From a user's perspective, the database encoding is only a constraint on
which characters he can store. He does not know or care what the bit
representation is inside the server. As such, if we store a non-ASCII
character in a JSON string, it's valid JSON as far as the user is
concerned, so long as that character exists in the Unicode standard.
If his client encoding is UTF8, the value will be letter-perfect JSON
when it gets to him; and if his client encoding is not UTF8, then he's
already pretty much decided that he doesn't give a fig about the
Unicode-centricity of the JSON spec, no?
So I'm with Robert: we should just plain not care. I would further
suggest that maybe what we should do with incoming JSON escape sequences
is convert them to Unicode code points and then to the equivalent
character in the database encoding (or throw error if there is none).
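To make that last suggestion concrete, here is a rough sketch in Python rather than backend C. It is only an illustration under stated assumptions: `convert_unicode_escapes`, `_read_hex4` and `db_encoding` are hypothetical names, Python codec names stand in for server encodings, and the choice to leave ASCII escapes untouched (so escaped quotes and control characters stay escaped) is an assumption of the sketch, not something settled in this thread.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def _read_hex4(body, pos):
    # Parse exactly four hex digits starting at pos, or raise ValueError.
    digits = body[pos:pos + 4]
    if len(digits) != 4 or not set(digits) <= HEX_DIGITS:
        raise ValueError("malformed \\u escape at offset %d" % pos)
    return int(digits, 16)


def convert_unicode_escapes(body, db_encoding):
    # Hypothetical helper: turn non-ASCII \uXXXX escapes in JSON text into
    # real characters, then encode the result in db_encoding.  Encoding
    # raises UnicodeEncodeError when the target has no such character.
    out = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
            continue
        if body[i + 1:i + 2] != "u":
            # Any other two-character escape (\n, \", \\ ...) is copied as-is.
            out.append(body[i:i + 2])
            i += 2
            continue
        start = i
        code = _read_hex4(body, i + 2)
        i += 6
        if 0xD800 <= code <= 0xDBFF:
            # High surrogate: it must be followed by a low-surrogate escape.
            if body[i:i + 2] != "\\u":
                raise ValueError("unpaired high surrogate")
            low = _read_hex4(body, i + 2)
            if not 0xDC00 <= low <= 0xDFFF:
                raise ValueError("unpaired high surrogate")
            code = 0x10000 + ((code - 0xD800) << 10) + (low - 0xDC00)
            i += 6
        elif 0xDC00 <= code <= 0xDFFF:
            raise ValueError("unpaired low surrogate")
        if code < 0x80:
            # Assumption of this sketch: ASCII escapes stay escaped.
            out.append(body[start:i])
        else:
            out.append(chr(code))
    return "".join(out).encode(db_encoding)


print(convert_unicode_escapes('"caf\\u00e9"', "latin-1"))  # b'"caf\xe9"'
```

With a LATIN1-style target, the escape for U+00E9 becomes the single byte 0xE9, while an escape for U+20AC (the euro sign) raises UnicodeEncodeError, which corresponds to the "throw error if there is none" case.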
regards, tom lane