From: | Kirk Parker <khp(at)equatoria(dot)us> |
---|---|
To: | vavankaru(at)gmail(dot)com, pgsql-docs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Ambiguity in IS JSON description and logic |
Date: | 2025-03-20 14:22:14 |
Message-ID: | CANwZ8rmj7Mr_ZaZsfH5cx5uNzRZ91aHaRWe=ViJGO=U7rKjFCQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs |
On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <khp(at)equatoria(dot)us> wrote:
>
> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <
> noreply(at)postgresql(dot)org> wrote:
>
>> The following documentation comment has been logged on the website:
>>
>> Page: https://www.postgresql.org/docs/17/functions-json.html
>> Description:
>>
>> On the manual page
>> https://www.postgresql.org/docs/current/functions-json.html, in the Table
>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
>> object in the expression is also tested to see if it has duplicate keys."
>> And such text is ambiguous, because the term "object" has certain meaning
>> regarding json format. In reality the option WITH UNIQUE KEYS allows to
>> check for duplicated keys any array element not object. For objects, both
>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it
>> is at the same time with and without unique values, how it is possible?),
>> i.e. it works the same as just IS JSON ARRAY. The example code that
>> confirms
>> my reasoning:
>> SELECT
>> js.vl AS "tested str",
>>
>>
>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ
>> keys",
>>
>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ keys",
>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ keys",
>>
>>
>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
>> js.vl IS JSON ARRAY ".. array"
>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
>> "a2"}]'),
>> ('["a", "a"]')) AS js(vl);
>>
>> I'm not sure what should be the right logic for this option, for me it
>> looks
>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if
>> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
>> showed above should contain "array" instead of "object" because for
>> objects
>> it returns false independently of applied option. I tested it on
>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit".
>>
>
> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are duplicate
> keys", it's just a way of stating the default explicitly. In other words it
> means "w/o testing for duplicate keys". Thus IS JSON OBJECT and IS JSON
> OBJECT WITHOUT UNIQUE KEYS will both always return identical results on the
> same JSON expression.
>
> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
> intuitive, but for JSON arrays -- which don't have any concept of keys;
> JSON arrays are just ordered lists -- it means "does this array contain any
> embedded objects with duplicate keys".
>
> See:
>
> SELECT js,
> js IS JSON "json?",
> js IS JSON OBJECT "object?",
> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
> js IS JSON ARRAY "array?",
> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
> FROM (VALUES
> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
> ('{"b":2,"b":3}'), -- expect t for object, object w/o UK
> ('{"c":2,"d":3}'), -- expect t for ALL object tests
> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for
> nested objects
> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
> recursively for embedded objecs
> ) foo(js);
>
>
> A couple of side notes:
>
> 1. Your first data example is not JSON at all. It's helpful for this kind
> of test to include a plain IS JSON column, since any of the IS JSON X tests
> can fail for two reasons: (a) it's not JSON, or (b) it is JSON but it's not
> an X.
>
> 2. Curiously, the JSON spec itself is completely silent on the meaning of
> objects with duplicate keys. PostgreSQL is more helpful in this
> regard--the docs explicitly state that the last value is the one that is
> retained by JSONB and used in processing functions.
>
>
To improve the documentation here, I would suggest simply adding the word
"recursively" after "tested":
If WITH UNIQUE KEYS is specified, then any object in the *expression* is
also tested recursively to see if it has duplicate keys
As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means
"without testing for uniqueness" rather than "confirming that non-unique
keys are present", I will defer to anyone who has more familiarity with the
PostgreSQL documentation style. I probed around looking at other
situations where there are phrases that make the default condition explicit
(e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my
attempts ended up too wordy for my liking.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2025-03-20 16:04:19 | Re: WHEN SQLSTATE '00000' THEN equals to WHEN OTHERS THEN |
Previous Message | Kirk Parker | 2025-03-20 14:08:50 | Re: Ambiguity in IS JSON description and logic |