Re: Ambiguity in IS JSON description and logic

From: noname stranger <vavankaru(at)gmail(dot)com>
To: Kirk Parker <khp(at)equatoria(dot)us>
Cc: pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: Ambiguity in IS JSON description and logic
Date: 2025-03-20 16:06:50
Message-ID: CAMRLGLzWEd-W9EGjV4wfB45OOWTxDcPq8yYEaJyev6Ad8mOAsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Hello Kirk,

Thank you for your answer and detailed explanation. You are absolutely
right, I made a mistake in my reasoning. On the other hand, some ambiguity
exists.
It is a perfect idea to clarify what WITHOUT UNIQUE KEYS means. Such
clarification could help with understanding the logic of IS JSON faster and
avoiding confusion and ambiguity. I found the same option for a few JSON
creation functions in table 9.47 at the same page, like json_object() and
json(). Their descriptions also don't clarify this point.

Best regards,
Valery

чт, 20 мар. 2025 г. в 15:22, Kirk Parker <khp(at)equatoria(dot)us>:

> On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <khp(at)equatoria(dot)us> wrote:
>
>>
>> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <
>> noreply(at)postgresql(dot)org> wrote:
>>
>>> The following documentation comment has been logged on the website:
>>>
>>> Page: https://www.postgresql.org/docs/17/functions-json.html
>>> Description:
>>>
>>> On the manual page
>>> https://www.postgresql.org/docs/current/functions-json.html, in the
>>> Table
>>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
>>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
>>> object in the expression is also tested to see if it has duplicate keys."
>>> And such text is ambiguous, because the term "object" has certain meaning
>>> regarding json format. In reality the option WITH UNIQUE KEYS allows to
>>> check for duplicated keys any array element not object. For objects, both
>>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
>>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true
>>> (it
>>> is at the same time with and without unique values, how it is possible?),
>>> i.e. it works the same as just IS JSON ARRAY. The example code that
>>> confirms
>>> my reasoning:
>>> SELECT
>>> js.vl AS "tested str",
>>>
>>>
>>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ
>>> keys",
>>>
>>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ
>>> keys",
>>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ
>>> keys",
>>>
>>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
>>> js.vl IS JSON ARRAY ".. array"
>>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
>>> "a2"}]'),
>>> ('["a", "a"]')) AS js(vl);
>>>
>>> I'm not sure what should be the right logic for this option, for me it
>>> looks
>>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but
>>> if
>>> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
>>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
>>> showed above should contain "array" instead of "object" because for
>>> objects
>>> it returns false independently of applied option. I tested it on
>>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120,
>>> 64-bit".
>>>
>>
>> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are
>> duplicate keys", it's just a way of stating the default explicitly. In
>> other words it means "w/o testing for duplicate keys". Thus IS JSON OBJECT
>> and IS JSON OBJECT WITHOUT UNIQUE KEYS will both always return identical
>> results on the same JSON expression.
>>
>> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
>> intuitive, but for JSON arrays -- which don't have any concept of keys;
>> JSON arrays are just ordered lists -- it means "does this array contain any
>> embedded objects with duplicate keys".
>>
>> See:
>>
>> SELECT js,
>> js IS JSON "json?",
>> js IS JSON OBJECT "object?",
>> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
>> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
>> js IS JSON ARRAY "array?",
>> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
>> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
>> FROM (VALUES
>> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
>> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
>> ('{"b":2,"b":3}'), -- expect t for object, object w/o UK
>> ('{"c":2,"d":3}'), -- expect t for ALL object tests
>> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for
>> nested objects
>> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
>> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
>> recursively for embedded objecs
>> ) foo(js);
>>
>>
>> A couple of side notes:
>>
>> 1. Your first data example is not JSON at all. It's helpful for this
>> kind of test to include a plain IS JSON column, since any of the IS JSON X
>> tests can fail for two reasons: (a) it's not JSON, or (b) it is JSON but
>> it's not an X.
>>
>> 2. Curiously, the JSON spec itself is completely silent on the meaning of
>> objects with duplicate keys. PostgreSQL is more helpful in this
>> regard--the docs explicitly state that the last value is the one that is
>> retained by JSONB and used in processing functions.
>>
>>
> To improve the documentation here, I would suggest simply adding the word
> "recursively" after "tested":
>
> If WITH UNIQUE KEYS is specified, then any object in the *expression* is
> also tested recursively to see if it has duplicate keys
>
> As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means
> "without testing for uniqueness" rather than "confirming that non-unique
> keys are present", I will defer to anyone who has more familiarity with the
> PostgreSQL documentation style. I probed around looking at other
> situations where there are phrases that make the default condition explicit
> (e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my
> attempts ended up too wordy for my liking.
>

--
С уважением,
Валерий

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message David G. Johnston 2025-03-21 02:04:18 Re: Ambiguity in IS JSON description and logic
Previous Message Tom Lane 2025-03-20 16:04:19 Re: WHEN SQLSTATE '00000' THEN equals to WHEN OTHERS THEN