Re: ❓ JSON Path Dot Precedence

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ❓ JSON Path Dot Precedence
Date: 2024-04-07 19:46:58
Message-ID: o2f7ccz6dywd7tqgbighi2t3waptrnklxzv54ntiq4l3kh4xpu@gwfzpeql2rb3
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-04-07 18:13 +0200, David E. Wheeler wrote:
> A question about the behavior of the JSON Path parser. The docs[1]
> have this to say about numbers:
>
> > Numeric literals in SQL/JSON path expressions follow JavaScript
> > rules, which are different from both SQL and JSON in some minor
> > details. For example, SQL/JSON path allows .1 and 1., which are
> > invalid in JSON.
>
> In other words, this is valid:
>
> david=# select '2.'::jsonpath;
> jsonpath
> ----------
> 2
>
> But this feature creates a bit of a conflict with the use of a dot for
> path expressions. Consider `0x2.p10`. How should that be parsed? As an
> invalid decimal expression ("trailing junk after numeric literal”), or
> as a valid integer 2 followed by the path segment “p10”? Here’s the
> parser’s answer:
>
> david=# select '0x2.p10'::jsonpath;
> jsonpath
> -----------
> (2)."p10"
>
> So it would seem that, other things being equal, a path key expression
> (`.foo`) is slightly higher precedence than a decimal expression. Is
> that intentional/correct?

I guess jsonpath assumes that hex, octal, and binary literals are
integers. So there's no ambiguity about any fractional part that might
follow.

> Discovered while writing my Go lexer and throwing all of Go’s floating
> point literal examples[2] at it and comparing to the Postgres path
> parser. Curiously, this is only an issue for 0x/0o/0b numeric
> expressions; a decimal expression does not behave in the same way:
>
> david=# select '2.p10'::jsonpath;
> ERROR: trailing junk after numeric literal at or near "2.p" of jsonpath input
> LINE 1: select '2.p10'::jsonpath;

It scans the decimal "2." and then finds junks "p10".

Works with a full decimal:

test=# select '3.14.p10'::jsonpath;
jsonpath
--------------
(3.14)."p10"
(1 row)

And with extra whitespace to resolve the ambiguity:

test=# select '2 .p10'::jsonpath;
jsonpath
-----------
(2)."p10"
(1 row)

> Which maybe seems a bit inconsistent.
>
> Thoughts on what the “correct” behavior should be?

I'd say a member accessor after a number doesn't really make sense
because object keys are strings. One could argue that path "$.2.p10"
should match JSON '{"2":{"p10":42}}', i.e. the numeric accessor is
converted to a string. For example, in nodejs I can do:

> var x = {2: {p10: 42}}
> x[2].p10
42

But that's JavaScript, not JSON.

Also, is there even a use case for path "0x2.p10"? The path has to
start with "$" or ("@" in case of a filter expression), doesn't it? And
it that case it doesn't parse:

test=# select '$.0x2.p10'::jsonpath;
ERROR: trailing junk after numeric literal at or near ".0x" of jsonpath input
LINE 1: select '$.0x2.p10'::jsonpath;

Even with extra whitespace:

test=# select '$ . 0x2 . p10'::jsonpath;
ERROR: syntax error at or near "0x2" of jsonpath input
LINE 1: select '$ . 0x2 . p10'::jsonpath;

Or should it behave like an array accessor? Similar to:

test=# select jsonb_path_query('[0,1,{"p10":42},3]', '$[0x2].p10'::jsonpath);
jsonb_path_query
------------------
42
(1 row)

> [1]: https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH
> [2]: https://tip.golang.org/ref/spec#Floating-point_literals

--
Erik

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-04-07 19:47:00 Re: LogwrtResult contended spinlock
Previous Message Tom Lane 2024-04-07 19:38:44 Re: EXCLUDE COLLATE in CREATE/ALTER TABLE document