Re: SQL/JSON query functions context_item doc entry and type requirement

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Chapman Flack <jcflack(at)acm(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL/JSON query functions context_item doc entry and type requirement
Date: 2024-06-21 00:46:44
Message-ID: CAKFQuwbYBvUZasGj_ZnfXhC2kk4AT=epwGkNd2=RMMVXkfTNMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 20, 2024 at 9:01 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:

> On Thu, Jun 20, 2024 at 5:46 PM Amit Langote <amitlangote09(at)gmail(dot)com>
> wrote:
> >
> > On Thu, Jun 20, 2024 at 1:03 AM David G. Johnston
> > <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> > > On Wed, Jun 19, 2024 at 8:29 AM jian he <jian(dot)universality(at)gmail(dot)com>
> wrote:
> > >>
> > >> On Mon, Jun 17, 2024 at 9:05 PM Chapman Flack <jcflack(at)acm(dot)org>
> wrote:
> > >> >
> > >> > Hi,
> > >> >
> > >> > On 06/17/24 02:43, Amit Langote wrote:
> > >> > > <replaceable>context_item</replaceable> expression can be a value
> of
> > >> > > any type that can be cast to <type>jsonb</type>. This includes
> types
> > >> > > such as <type>char</type>, <type>text</type>,
> <type>bpchar</type>,
> > >> > > <type>character varying</type>, and <type>bytea</type> (with
> > >> > > <code>ENCODING UTF8</code>), as well as any domains over these
> types.
> > >> >
> > >> > Reading this message in conjunction with [0] makes me think that we
> are
> > >> > really talking about a function that takes a first parameter of
> type jsonb,
> > >> > and behaves exactly that way (so any cast required is applied by
> the system
> > >> > ahead of the call). Under those conditions, this seems like an
> unusual
> > >> > sentence to add in the docs, at least until we have also documented
> that
> > >> > tan's argument can be of any type that can be cast to double
> precision.
> > >> >
> > >>
> > >> I guess it would be fine to add an unusual sentence to the docs.
> > >>
> > >> imagine a function: array_avg(anyarray) returns anyelement.
> > >> array_avg calculate an array's elements's avg. like
> > >> array('{1,2,3}'::int[]) returns 2.
> > >> but array_avg won't make sense if the input argument is a date array.
> > >> so mentioning in the doc: array_avg can accept anyarray, but anyarray
> > >> cannot date array.
> > >> seems ok.
> > >
> > >
> > > There is existing wording for this:
> > >
> > > "The expression can be of any JSON type, any character string type, or
> bytea in UTF8 encoding."
> > >
> > > If you add this sentence to the paragraph the link that already
> exists, which simply points the reader to this sentence, becomes redundant
> and should be removed.
> >
> > I've just posted a patch in the other thread [1] to restrict
> > context_item to be of jsonb type, which users would need to ensure by
> > adding an explicit cast if needed. I think that makes this
> > clarification unnecessary.
> >
> > > As for table 9.16.3 - it is unwieldy already. Lets try and make the
> core syntax shorter, not longer. We already have precedence in the
> subsequent json_table section - give each major clause item a name then
> below the table define the syntax and meaning for those names. Unlike in
> that section - which probably should be modified too - context_item should
> have its own description line.
> >
> > I had posted a patch a little while ago at [1] to render the syntax a
> > bit differently with each function getting its own syntax synopsis.
> > Resending it here; have addressed Jian He's comments.
> >
> > --
>

I was thinking more like:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c324906b22..b9d157663a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18692,8 +18692,10 @@ $.* ? (@ like_regex "^\\d+$")
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_exists</primary></indexterm>
<function>json_exists</function> (
- <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable> <optional>
<literal>PASSING</literal> { <replaceable>value</replaceable>
<literal>AS</literal> <replaceable>varname</replaceable> } <optional>,
...</optional></optional>
- <optional> { <literal>TRUE</literal> | <literal>FALSE</literal>
|<literal> UNKNOWN</literal> | <literal>ERROR</literal> } <literal>ON
ERROR</literal> </optional>)
+ <replaceable>context_item</replaceable>,
+ <replaceable>path_expression</replaceable>
+ <optional>variable_definitions</optional>
+ <optional>on_error_boolean</optional>)
</para>
<para>
Returns true if the SQL/JSON
<replaceable>path_expression</replaceable>
@@ -18732,12 +18734,14 @@ ERROR: jsonpath array subscript is out of bounds
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_query</primary></indexterm>
<function>json_query</function> (
- <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable> <optional>
<literal>PASSING</literal> { <replaceable>value</replaceable>
<literal>AS</literal> <replaceable>varname</replaceable> } <optional>,
...</optional></optional>
- <optional> <literal>RETURNING</literal>
<replaceable>data_type</replaceable> <optional> <literal>FORMAT
JSON</literal> <optional> <literal>ENCODING UTF8</literal> </optional>
</optional> </optional>
- <optional> { <literal>WITHOUT</literal> | <literal>WITH</literal>
{ <literal>CONDITIONAL</literal> |
<optional><literal>UNCONDITIONAL</literal></optional> } } <optional>
<literal>ARRAY</literal> </optional> <literal>WRAPPER</literal> </optional>
- <optional> { <literal>KEEP</literal> | <literal>OMIT</literal> }
<literal>QUOTES</literal> <optional> <literal>ON SCALAR STRING</literal>
</optional> </optional>
- <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional>
| <literal>OBJECT</literal> } | <literal>DEFAULT</literal>
<replaceable>expression</replaceable> } <literal>ON EMPTY</literal>
</optional>
- <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>EMPTY</literal> { <optional> <literal>ARRAY</literal> </optional>
| <literal>OBJECT</literal> } | <literal>DEFAULT</literal>
<replaceable>expression</replaceable> } <literal>ON ERROR</literal>
</optional>)
+ <replaceable>context_item</replaceable>,
+ <replaceable>path_expression</replaceable>
+ <optional>variable_definitions</optional>
+ <optional>return_clause</optional>
+ <optional>wrapping_clause</optional>
+ <optional>quoting_clause</optional>
+ <optional>on_empty_set</optional>
+ <optional>on_error_set</optional>)
</para>
<para>
Returns the result of applying the SQL/JSON
@@ -18809,11 +18813,12 @@ DETAIL: Missing "]" after array dimensions.
<entry role="func_table_entry"><para role="func_signature">
<indexterm><primary>json_value</primary></indexterm>
<function>json_value</function> (
- <replaceable>context_item</replaceable>,
<replaceable>path_expression</replaceable>
- <optional> <literal>PASSING</literal> {
<replaceable>value</replaceable> <literal>AS</literal>
<replaceable>varname</replaceable> } <optional>, ...</optional></optional>
- <optional> <literal>RETURNING</literal>
<replaceable>data_type</replaceable> </optional>
- <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>DEFAULT</literal> <replaceable>expression</replaceable> }
<literal>ON EMPTY</literal> </optional>
- <optional> { <literal>ERROR</literal> | <literal>NULL</literal> |
<literal>DEFAULT</literal> <replaceable>expression</replaceable> }
<literal>ON ERROR</literal> </optional>)
+ <replaceable>context_item</replaceable>,
+ <replaceable>path_expression</replaceable>
+ <optional>variable_definitions</optional>
+ <optional>return_type</optional>
+ <optional>on_empty_value</optional>
+ <optional>on_error_value</optional>)
</para>
<para>
Returns the result of applying the SQL/JSON

Then defining each of those below the table - keeping the on_error variants
together.

> playing around with it.
> found some minor issues:
>
> json_exists allow: DEFAULT expression ON ERROR, which is not
> mentioned in the doc.
> for example:
> select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default
> true ON ERROR);
> select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 0 ON
> ERROR);
> select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 11 ON
> ERROR);
>

Yeah, surprised it works, the documented behavior seems logical. Being
able to return a non-boolean here seems odd. Especially since it is cast
to boolean on output.

> JSON_VALUE on error, on empty semantics should be the same as json_query.
> like:
> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
> ON EMPTY ]
> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
> ON ERROR ])
>
> examples:
> select JSON_value(jsonb '[]' , '$' empty array on error);
> select JSON_value(jsonb '[]' , '$' empty object on error);
>

Again the documented behavior seems to make sense though and the ability to
specify empty in the value function seems like a bug. If you really want
an empty array or object you do have access to default. The reason
json_query provides for an empty array/object is that it is already
expecting to produce an array (object seems a little odd).

I agree our docs and code do not match which needs to be fixed, ideally in
the direction of the standard which I'm guessing our documentation is based
off of. But let's not go off of my guess.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2024-06-21 01:00:48 Re: ON ERROR in json_query and the like
Previous Message Amit Langote 2024-06-21 00:22:36 Re: ON ERROR in json_query and the like