Re: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
Cc: Tom Lane PostgreSQL <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, mmoncure(at)gmail(dot)com, pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?
Date: 2022-06-18 10:28:01
Message-ID: F24FA846-001B-46A2-A399-59A1B72AB78A@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 18 Jun 2022, at 2:14, Bryn Llewellyn <bryn(at)yugabyte(dot)com> wrote:
>
> I implemented two complementary functions:
>
> —"no_null_keys()" checks that a "jsonb" value has no occurrences of « "some key": null »
>
> —"strip_null_keys()" removes « "some key": null » occurrences from a "jsonb" value
>
> The code checks with "no_null_keys()" that, as expected, no ingested JSON document has an occurrence of « "some key": null ».
>
> And it uses "strip_null_keys()" on the output of "to_jsonb()" — and, as appropriate, any other built-in JSON function that produces a "jsonb" value.
>
> It was straightforward to implement these two functions by using REGEXP built-in functionality on the canonically formatted "text" value produced by the "jsonb::text" typecast.

In my experience, using regular expressions applied to document formats tends to get you false positives. I’d be worried about books with titles similar to 'How we wrote a regular expression to detect occurrences of "some key": null in our JSON documents', for example.

For stripping those null occurrences, you are aware of the json_strip_nulls(json) and jsonb_strip_nulls(jsonb) functions, right?

For detecting them on a recent PG, the @? operator or json_path_exists(json, jsonpath) functions would probably do the trick.
I am not too familiar with JSONPATH expressions, but I expect (it passed some preliminary testing) this would detect your nulls just fine, while taking JSON semantics into account:

jsonb_path_exists(document, '$.** ? (@ == null)'::jsonpath)

For PG-specifics on JSONPATH, see section 9.16.2 on: https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-OP-TABLE

A recursive query is another possible solution. It would probably perform far worse, but I find them more rewarding to write. Some people prefer Sodoku.

Regards,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bryn Llewellyn 2022-06-19 03:33:24 Re: ISBN (was: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?)
Previous Message Peter J. Holzer 2022-06-18 07:07:00 Re: