From: | Alban Hertroys <haramrae(at)gmail(dot)com> |
---|---|
To: | Bryn Llewellyn <bryn(at)yugabyte(dot)com> |
Cc: | Tom Lane PostgreSQL <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, mmoncure(at)gmail(dot)com, pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really? |
Date: | 2022-06-18 10:28:01 |
Message-ID: | F24FA846-001B-46A2-A399-59A1B72AB78A@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> On 18 Jun 2022, at 2:14, Bryn Llewellyn <bryn(at)yugabyte(dot)com> wrote:
>
> I implemented two complementary functions:
>
> —"no_null_keys()" checks that a "jsonb" value has no occurrences of « "some key": null »
>
> —"strip_null_keys()" removes « "some key": null » occurrences from a "jsonb" value
>
> The code checks with "no_null_keys()" that, as expected, no ingested JSON document has an occurrence of « "some key": null ».
>
> And it uses "strip_null_keys()" on the output of "to_jsonb()" — and, as appropriate, any other built-in JSON function that produces a "jsonb" value.
>
> It was straightforward to implement these two functions by using REGEXP built-in functionality on the canonically formatted "text" value produced by the "jsonb::text" typecast.
In my experience, using regular expressions applied to document formats tends to get you false positives. I’d be worried about books with titles similar to 'How we wrote a regular expression to detect occurrences of "some key": null in our JSON documents', for example.
For stripping those null occurrences, you are aware of the json_strip_nulls(json) and jsonb_strip_nulls(jsonb) functions, right?
For detecting them on a recent PG, the @? operator or json_path_exists(json, jsonpath) functions would probably do the trick.
I am not too familiar with JSONPATH expressions, but I expect (it passed some preliminary testing) this would detect your nulls just fine, while taking JSON semantics into account:
jsonb_path_exists(document, '$.** ? (@ == null)'::jsonpath)
For PG-specifics on JSONPATH, see section 9.16.2 on: https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-OP-TABLE
A recursive query is another possible solution. It would probably perform far worse, but I find them more rewarding to write. Some people prefer Sodoku.
Regards,
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
From | Date | Subject | |
---|---|---|---|
Next Message | Bryn Llewellyn | 2022-06-19 03:33:24 | Re: ISBN (was: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?) |
Previous Message | Peter J. Holzer | 2022-06-18 07:07:00 | Re: |