Re: Q: Escapes in jsonpath Idents

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Q: Escapes in jsonpath Idents
Date: 2024-03-18 00:09:50
Message-ID: 2udxavv5raiz4kgire33io7xr7gtf4d7fwvv6oerqia5gjpwp5@gebfvx6k342j
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-03-17 20:50 +0100, David E. Wheeler wrote:
> On Mar 17, 2024, at 15:12, Erik Wienhold <ewie(at)ewie(dot)name> wrote:
> > So I think it makes sense to reword the entire backslash part of the
> > paragraph and remove references to JSON entirely. The attached patch
> > does that and also formats the backslash escapes as a bulleted list for
> > readability.
>
> Ah, it’s JavaScript format, not JSON! This does clarify things quite
> nicely, thank you. Happy to add my review once it’s in a commit fest.

Thanks. https://commitfest.postgresql.org/48/4899/

> > The first case ($.$foo) is in line with the restriction on member
> > accessors that you quoted first.
>
> Huh, that’s now how I read it. Here it is again:
>
> >> Member accessor that returns an object member with the specified
> >> key. If the key name matches some named variable starting with $ or
> >> does not meet the JavaScript rules for an identifier, it must be
> >> enclosed in double quotes to make it a string literal.
>
>
> Note that in my example `$foo` does not match a variable. I mean it
> looks like a variable, but none is used here. I guess it’s being
> conservative because it might be used in one of the functions, like
> jsonb_path_exists(), to which variables might be passed.

I had the same reasoning while writing my first reply but scrapped that
part because I found it obvious: That jsonpath is parsed before calling
jsonb_path_exists() and therefore the parser has no context about any
variables, which might not even be hardcoded but may result from a
query.

> > The error message 'syntax error at or near "$oo" of jsonpath input' for
> > the second case ($.f$oo), however, looks as if the scanner identifies
> > '$oo' as a variable instead of contiuing the scan of identifier (f$oo)
> > for the member accessor. Looks like a bug to me because a variable
> > doesn't even make sense in that place.
>
> Right. Maybe the docs should be updated to say that a literal dollar
> sign isn’t supported in identifiers, unlike in JavaScript, except
> through escapes like this:

Unfortunately, I don't have access to that part of the SQL spec. So I
don't know how the jsonpath grammar is specified.

I had a look into Oracle, MySQL, and SQLite docs to see what they
implement:

* Oracle requires the unquoted field names to match [A-Za-z][A-Za-z0-9]*
(see "object steps"). It also supports variables.
https://docs.oracle.com/en/database/oracle/oracle-database/23/adjsn/json-path-expressions.html

* MySQL refers to ECMAScript identifiers but does not say anything about
variables: https://dev.mysql.com/doc/refman/8.3/en/json.html#json-path-syntax

* SQLite skimps on details and does not document a grammar:
https://sqlite.org/json1.html#path_arguments
But it looks as if it strives for compatibility with MySQL and our dear
Postgres: https://sqlite.org/src/doc/json-in-core/doc/json-enhancements.md

Also checked git log src/backend/utils/adt/jsonpath_scan.l for some
insights but haven't found any yet.

--
Erik

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-03-18 00:13:33 Re: Support json_errdetail in FRONTEND builds
Previous Message jian he 2024-03-18 00:00:00 Re: Add pg_basetype() function to obtain a DOMAIN base type