Re: to_json(NULL) should to return JSON null instead NULL

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: to_json(NULL) should to return JSON null instead NULL
Date: 2015-08-29 20:47:32
Message-ID: 55E21A64.4090600@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/29/15 3:27 PM, Tom Lane wrote:
> Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> writes:
>> On 8/29/15 12:29 PM, Pavel Stehule wrote:
>>> what is correct from JSON perspective? All fields with NULL
>
>> ISTM that the whole purpose of to_json is to properly jsonify something,
>> and the proper json form for "undefined" is 'null', is it not?
>
> What's not entirely clear is what we should do with cases like
>
> regression=# select array_to_json(null::int[]);
> array_to_json
> ---------------
>
> (1 row)
>
> regression=# select row_to_json(null::record);
> row_to_json
> -------------
>
> (1 row)
>
> If we leave those alone (and in the latter case, in particular, there is
> not enough information available to do much else) then it's not so clear
> that changing to_json() is really improving consistency overall.
> For instance, do we really want row_to_json(null::record) and
> to_json(null::record) giving different results? Or if we make them
> both return "null", that breaks the previous invariant that row_to_json
> always yields a JSON object.

The tricky part is that if you're calling any of those functions to find
the value to stick inside a JSON array or object then NULL is definitely
incorrect. IE: if you were trying to turn the results of this

create table t(a int, b text);
insert into t(a) values(1);

Into a JSON object, you'd want '{"a":1,"b":null}', not NULL. Of course
you'd just use row_to_json() for something that simple, but if you were
doing something more complex you might have to code a transform yourself.

> An advantage of leaving these things as strict is that the user can easily
> substitute whatever specific behavior she wants for NULLs via coalesce(),
> as was shown upthread. If we put in a different behavior, then the
> only way to override it would be with a CASE, which is tedious and creates
> multiple-evaluation issues.

Certainly true.

The downside to leaving it alone is this will probably be hard to debug
if you're using it to build a complex JSON object. One NULL ends up in
the right place and suddenly your whole output becomes NULL. I think
this is why %s works the way it does in format as well.

If we do change it I think it best to add an argument to control what it
does with a NULL so you can get whichever you need. It might be worth
adding to (array|record)_to_json as well, though I think the use case
for those is markedly different than for plain to_json, so maybe not.

> I'm not necessarily against changing it --- but it doesn't seem entirely
> black-and-white to me, and we do now have a couple of versions worth
> of precedent we'd be breaking with.
>
> If we do vote to change it, I'd want to do so now (ie in 9.5) rather than
> create yet another year's worth of precedent.

I wonder how much people are actually using to_json(). I've done some
amount of JSON massaging and don't recall needing it yet.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-08-29 20:49:56 Re: 9.4 broken on alpha
Previous Message Jim Nasby 2015-08-29 20:27:15 Re: buffer README is out of date