From: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: to_json(NULL) should to return JSON null instead NULL |
Date: | 2015-08-29 20:47:32 |
Message-ID: | 55E21A64.4090600@BlueTreble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 8/29/15 3:27 PM, Tom Lane wrote:
> Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> writes:
>> On 8/29/15 12:29 PM, Pavel Stehule wrote:
>>> what is correct from JSON perspective? All fields with NULL
>
>> ISTM that the whole purpose of to_json is to properly jsonify something,
>> and the proper json form for "undefined" is 'null', is it not?
>
> What's not entirely clear is what we should do with cases like
>
> regression=# select array_to_json(null::int[]);
> array_to_json
> ---------------
>
> (1 row)
>
> regression=# select row_to_json(null::record);
> row_to_json
> -------------
>
> (1 row)
>
> If we leave those alone (and in the latter case, in particular, there is
> not enough information available to do much else) then it's not so clear
> that changing to_json() is really improving consistency overall.
> For instance, do we really want row_to_json(null::record) and
> to_json(null::record) giving different results? Or if we make them
> both return "null", that breaks the previous invariant that row_to_json
> always yields a JSON object.
The tricky part is that if you're calling any of those functions to find
the value to stick inside a JSON array or object then NULL is definitely
incorrect. IE: if you were trying to turn the results of this
create table t(a int, b text);
insert into t(a) values(1);
Into a JSON object, you'd want '{"a":1,"b":null}', not NULL. Of course
you'd just use row_to_json() for something that simple, but if you were
doing something more complex you might have to code a transform yourself.
> An advantage of leaving these things as strict is that the user can easily
> substitute whatever specific behavior she wants for NULLs via coalesce(),
> as was shown upthread. If we put in a different behavior, then the
> only way to override it would be with a CASE, which is tedious and creates
> multiple-evaluation issues.
Certainly true.
The downside to leaving it alone is this will probably be hard to debug
if you're using it to build a complex JSON object. One NULL ends up in
the right place and suddenly your whole output becomes NULL. I think
this is why %s works the way it does in format as well.
If we do change it I think it best to add an argument to control what it
does with a NULL so you can get whichever you need. It might be worth
adding to (array|record)_to_json as well, though I think the use case
for those is markedly different than for plain to_json, so maybe not.
> I'm not necessarily against changing it --- but it doesn't seem entirely
> black-and-white to me, and we do now have a couple of versions worth
> of precedent we'd be breaking with.
>
> If we do vote to change it, I'd want to do so now (ie in 9.5) rather than
> create yet another year's worth of precedent.
I wonder how much people are actually using to_json(). I've done some
amount of JSON massaging and don't recall needing it yet.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2015-08-29 20:49:56 | Re: 9.4 broken on alpha |
Previous Message | Jim Nasby | 2015-08-29 20:27:15 | Re: buffer README is out of date |