Re: [PATCH] WIP: replace method for jsonpath

From: Florents Tselai <florents(dot)tselai(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] WIP: replace method for jsonpath
Date: 2024-09-19 12:57:00
Message-ID: CA+v5N4288cOvp0J3za5gRwqmw3PH18wphhve1M-i3orRzUyDtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18 Sep 2024, at 3:47 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

On 2024-09-18 We 4:23 AM, Peter Eisentraut wrote:

On 17.09.24 21:16, David E. Wheeler wrote:

On Sep 17, 2024, at 15:03, Florents Tselai <florents(dot)tselai(at)gmail(dot)com>
wrote:

Fallback scenario: make this an extension, but in a first pass I didn’t
find any convenient hooks.
One has to create a whole new scanner, grammar etc.

Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension"
feature[1], which allows users to add functions. Would be cool to have a
way to register jsonpath functions somehow, but I would imagine it’d need
quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to
see something like that appear in a future version of the spec, with an
interface something like CREATE OPERATOR.

Why can't we "just" call any suitable pg_proc-registered function from JSON
path? The proposed patch routes the example '$.replace("hello","bye")'
internally to the internal implementation of the SQL function replace(...,
'hello', 'bye'). Why can't we do this automatically for any function call
in a JSON path expression?

That might work. The thing that bothers me about the original proposal is
this: what if we add a new non-spec jsonpath method and then a new version
of the spec adds a method with the same name, but not compatible with our
method? We'll be in a nasty place. At the very least I think we need to try
hard to avoid that. Maybe we should prefix non-spec method names with
"pg_", or maybe use an initial capital letter.

If naming is your main reservation, then I take it you’re generally
positive.

Having said that, “pg_” is probably too long for a jsonpath expression,
Most importantly though, “pg_” in my mind is a prefix for things like
catalog lookup and system monitoring.
Not a function that the average user would use.
Thus, I lean towards initial-capital.

The more general case would look like:
A new jsonpath item of the format $.Func(arg1, …, argn) can be applied
(recursively or not) to a json object.

As a first iteration/version only pg_proc-registered functions of the
format func(text, ...,) -> text are available.
We can focus on oid(arg0) = TEXTOID and rettype = TEXTOID fist.
The first arg0 has to be TEXTOID (the current json string) while subsequent
args are provided by the user
in the jsonpath expression.

The functions we want to support will be part of jsonpath grammar
and during execution we'll have enough info to find the appropriate
PGFunction to call.

What I'm missing yet is how we could handle vars jsonb,
in case the user doesn't want to just hardcode the actual function
arguments.
Then resolving argtypes1...n is a bit more complex:

The signature for jsonb_apply(doc jsonb, func text[, variadic "any"
args1_n]); [0]
Here, variadic "any" works beautifully, but that's a brand-new function.

In existing jsonb{path} facilities, vars are jsonb objects which could work
as well I think.
Unless I'm missing something.

[0] https://github.com/Florents-Tselai/jsonb_apply

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message a.imamov 2024-09-19 13:00:30 Re: Custom connstr in background_psql()
Previous Message Laurenz Albe 2024-09-19 12:53:52 Re: Wrong results with equality search using trigram index and non-deterministic collation