Re: Doc Rework: Section 9.16.13 SQL/JSON Query Functions

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Doc Rework: Section 9.16.13 SQL/JSON Query Functions
Date: 2024-07-05 12:35:32
Message-ID: CA+HiwqGZ+X9Po05aEUOck4hW6mGx2SGtE8ddTa-76eZoyr1gFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Jian,

Thanks for the reviews.

On Wed, Jul 3, 2024 at 11:15 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> Overall, I found this approach makes the synopsis scattered, it's not
> easy to see the full picture.
> for example:
> ```
> JSON_VALUE ( context_item, path_expression [variable_definitions]
> [return_type] [on_empty_value] [on_error_value]) → { text |
> return_data_type }
> ```
> this way it is not easy to find out that RETURNING is a keyword.
> Currently in master, we can quickly see RETURNING is the keyword, the
> master is kind of condense, though.
> but if you are insistent with your approach, then that is fine for me.

Actually, on second thought, I too am finding the new structure
whereby descriptions of various clauses are moved below the table a
bit hard to follow. Descriptions in the table have to use forward
references to bits outside the table and vice versa. Like you, I also
don't like the style used in the new structure for describing various
ON ERROR values that differ based on the function. It seems better to
just list the syntax in the table and then each function's syntax
synopsis tells what values are appropriate to return for that given
function ON ERROR.

So I've decided to resurrect the *other* documentation rewrite patch
that you reviewed back in May.

> "context_item can be a JSON document passed as a value of type json,
> jsonb document, a character or an UTF8- endoded bytea string."
> is wrong?
> e.g. SELECT JSON_EXISTS( NULL::bytea, 'lax $.a[5]' ERROR ON ERROR)

Oops, I was thinking of what can be used in the RETURNING clause.

> check following query:
> select oid, typtype , typname from pg_type where typcategory = 'S';
>
> I think a more accurate description would be:
> "context_item must be a JSON document passed as a value of type json,
> jsonb document, a character string type(text, name, bpchar, varchar)"
> do we need to mention domain over these types?

I don't feel the need to mention every possible type, so I went with:

+ <replaceable>context_item</replaceable> can be any character string that
+ can be succesfully cast to <type>jsonb</type>.

> -------------------------------------
> JSON_EXISTS
> Returns true if the SQL/JSON path_expression possibly referencing the
> variables in variable_definitions applied to the context_item yields
> any items.
> I am not native English speaker, so I found it hard to comprehend.
> I can understand it like:
> "Returns true if the SQL/JSON path_expression (possibly referencing
> the variables in variable_definitions) applied to the context_item
> yields any items."
>
> maybe we can write it into two sentences, or
> "Returns true if the SQL/JSON path_expression applied to the
> context_item yields any items."
> because you already mentioned "path_expression can also contain
> variables whose values are specified using the variable_definitions
> clause described below." in the top level.

Please check the attached patch which contains different text.

> -------------------------------------
> The JSON_QUERY and JSON_VALUE functions are polymorphic in their
> output type with the returning_clause clause dictating what that type
> is.
> how about
> The JSON_QUERY and JSON_VALUE functions output type can be vary, using
> returning_clause specify the desired data type.

Ditto.

> -------------------------------------
> your doc: JSON_VALUE "If path_expression points to a JSON null,
> JSON_VALUE returns a SQL NULL."
> `SELECT JSON_VALUE(jsonb 'null', '$');` here, the path_expression
> points to '$' which is not json null?
> so i like to change it to
> "If the extracted value is a JSON null, an SQL NULL value will return."

I've added a <note> at the bottom:

+ <note>
+ <para>
+ <function>JSON_VALUE()</function> returns SQL NULL if
+ <replaceable>path_expression</replaceable> returns a JSON
+ <literal>null</literal>, whereas <function>JSON_QUERY()</function> returns
+ the JSON <literal>null</literal> as is.
+ </para>
+ </note>

> -------------------------------------
> inconsistency:
> JSON_QUERY: <returnvalue></returnvalue> { <type>jsonb</type> |
> <replaceable>return_data_type</replaceable> }
> JSON_VALUE: <returnvalue></returnvalue> { <type>text</type> |
> <varname>return_data_type</varname> }

No longer in the patch.

> -------------------------------------
> {{For JSON_EXISTS (... on_error_boolean), alternative can be: ERROR,
> UNKNOWN, TRUE, FALSE.
> For JSON_QUERY (... on_error_set on_empty_set), alternative can be:
> ERROR, NULL, EMPTY ARRAY, EMPTY OBJECT, or DEFAULT followed by an
> expression.
> For JSON_VALUE (... on_error_set on_empty_set), alternative can be:
> ERROR, NULL, or DEFAULT followed by an expression.
> }}
> i am not sure what does there dot means here, in the synopsis section,
> three dots is significant.
> Also if I understand it correctly, JSON_EXISTS can only have on_error,
> then I am more confused with ``JSON_EXISTS (... on_error_boolean)``

Ditto.

> still based on v2-0001, v2-0002.
> picture attached.
>
> as you can see from the curly braces,
> ```
> { KEEP | OMIT } QUOTES [ ON SCALAR STRING ]
> ```
> we must choose one, "KEEP" or "OMIT".
>
> but the wrapping_clause:
> ```
> WITHOUT [ARRAY] WRAPPER
> WITH [UNCONDITIONAL] [ARRAY] WRAPPER
> WITH CONDITIONAL [ARRAY] WRAPPER
> ```
> this way, didn't say we must choose between one in these three.

Ditto.

> -----------
> on_error_boolean
> on_error_set
> on_error_value
> on_empty_set
> on_empty_value
>
> alternative ON { ERROR | EMPTY }
>
> ````
> didn't explain on_error_value, on_empty_value.
> why not just on_error_clause, on_empty_clause?

Ditto.

> -------
> <<<quoted paragraph
> When JSON_QUERY function produces multiple JSON values, they are
> returned as a JSON array. By default, the result values are
> unconditionally wrapped even if the array contains only one element.
> You can specify the WITH CONDITIONAL variant to say that the wrapper
> be added only when there are multiple values in the resulting array.
> Or specify the WITHOUT variant to say that the wrapper be removed when
> there is only one element, but it is ignored if there are multiple
> values.
> <<<quoted paragraph
>
> The above paragraph didn't explicitly mention that UNCONDITIONAL is the default.
> BTW, by comparing patch with master, I found out:
>
> """
> If the wrapper is UNCONDITIONAL, an array wrapper will always be
> applied, even if the returned value is already a single JSON object or
> an array. If it is CONDITIONAL, it will not be applied to a single
> JSON object or an array. UNCONDITIONAL is the default.
> """
> this description seems not right.
> if "UNCONDITIONAL is the default", then
> select json_query(jsonb '{"a": [1]}', 'lax $.a' with unconditional
> array wrapper);
> should be same as
> select json_query(jsonb '{"a": [1]}', 'lax $.a' );
>
> another two examples with SQL/JSON scalar item:
>
> select json_query(jsonb '{"a": 1}', 'lax $.a' );
> select json_query(jsonb '{"a": 1}', 'lax $.a' with unconditional wrapper);
>
> Am I interpreting "UNCONDITIONAL is the default" the wrong way?

Current text is confusing, so I've rewritten the paragraph as:

+ If the path expression may return multiple values, it might
be necessary
+ to wrap those values using the <literal>WITH
WRAPPER</literal> clause to
+ make it a valid JSON string, because the default behavior is
to not wrap
+ them, as if <literal>WITHOUT WRAPPER</literal> were specified. The
+ <literal>WITH WRAPPER</literal> clause is by default taken to mean
+ <literal>WITH UNCONDITIONAL WRAPPER</literal>, which means that even a
+ single result value will be wrapped. To apply the wrapper only when
+ multiple values are present, specify <literal>WITH
CONDITIONAL WRAPPER</literal>.
+ Note that an error will be thrown if multiple values are returned and
+ <literal>WITHOUT WRAPPER</literal> is specified.

So, UNCONDITIONAL is the default as in WITH [UNCONDITIONAL] WRAPPER.
(The default when no wrapping clause is present is WITHOUT WRAPPER as
seen in your example).

Please check the attached. I've also added <itemizedlist> lists as I
remember you had proposed before to make the functions' descriptions a
bit more readable -- I'm persuaded. :-)

--
Thanks, Amit Langote

Attachment Content-Type Size
v3-0001-SQL-JSON-Various-improvements-to-SQL-JSON-query-f.patch application/octet-stream 21.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2024-07-05 12:53:40 Re: Problem while installing PostgreSQL using make
Previous Message Andrey M. Borodin 2024-07-05 12:27:55 Re: Amcheck verification of GiST and GIN