From: | Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Nikita Glukhov <glukhov(dot)n(dot)a(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com> |
Subject: | Re: SQL:2023 JSON simplified accessor support |
Date: | 2025-02-05 07:20:30 |
Message-ID: | CAK98qZ3Ly6PhRwCVmMKJBba5oHVF9k370MMT2b_gep-SuQfRtg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi hackers,
On Tue, Nov 26, 2024 at 3:12 AM Peter Eisentraut <peter(at)eisentraut(dot)org>
wrote:
> On 21.11.24 23:46, Andrew Dunstan wrote:
> >> Questions:
> >>
> >> 1. Since Nikita’s patches did not address the JSON data type, and JSON
> >> currently does not support subscripting, should we limit the initial
> >> feature set to JSONB dot-notation for now? In other words, if we aim
> >> to fully support JSON simplified accessors for the plain JSON type,
> >> should we handle support for plain JSON subscripting as a follow-up
> >> effort?
> >>
> >> 2. I have yet to have a more thorough review of Nikita’s patches.
> >> One area I am not familiar with is the hstore-related changes. How
> >> relevant is hstore to the JSON simplified accessor?
> >>
> >
> > We can't change the way the "->" operator works, as there could well be
> > uses of it in the field that rely on its current behaviour. But maybe we
> > could invent a new operator which is compliant with the standard
> > semantics for dot access, and call that. Then we'd get the best
> > performance, and also we might be able to implement it for the plain
> > JSON type. If that proves not possible we can think about not
> > implementing for plain JSON, but I'd rather not go there until we have
> to.
>
> Yes, I think writing a custom operator that is similar to "->" but has
> the required semantics is the best way forward. (Maybe it can be just a
> function?)
>
> > I don't think we should be including hstore changes here - we should
> > just be aiming at implementing the standard for JSON access. hstore
> > changes if any should be a separate feature. The aren't relevant to JSON
> > access, although they might use some of the same infrastructure,
> > depending on implementation.
>
> In a future version, the operator/function mentioned above could be a
> catalogued property of a type, similar to typsubscript. Then you could
> also apply this to other types. But let's leave that for later.
>
> If I understand it correctly, Nikita's patch uses the typsubscript
> support function to handle both bracket subscripting and dot notation.
> I'm not sure if it's right to mix these two together. Maybe I didn't
> understand that correctly.
>
I’ve been working on a custom operator-like function to support dot
notation in lax mode for JSONB. However, I realized this approach has
the following drawbacks:
1. Handling both dot notation and bracket subscripting together
becomes complicated, as we still need to consider jsonb’s existing
type subscript functions.
2. Chaining N dot-access operators causes multiple unnecessary
deserialization/serialization cycles: for each operator call, the source
jsonb binary is converted to an in-memory JsonbValue, then the
relevant field is extracted, and finally it’s turned back into a
binary jsonb object. This hurts performance. A direct use of the
jsonpath functions API seems more efficient.
3. Correctly applying lax mode requires different handling for the
first, middle, and last operators, which adds further complexity.
Because of these issues, I took a closer look at Nikita’s patch. His
solution generalizes the existing jsonb typesubscript support function
to handle both bracket subscripting and dot notation. It achieves this
by translating dot notation into a jsonpath expression during
transformation, and then calls JsonPathQuery at execution.
Overall, I find this approach more efficient for chained accessors and
more flexible for future enhancements.
I attached a minimized version of Nikita’s patch (v7):
- The first three patches are refactoring steps that could be squashed
if preferred.
- The last two patches implement dot notation and wildcard access,
respectively.
Changes in this new version:
- Removed code handling hstore, as Andrew pointed out it isn’t
directly relevant to JSON access and should be handled separately.
- Split tests for dot notation and wildcard access.
- Dropped the two patches in v6 that enabled non-parenthesized column
references (per Nikita’s suggestion, this will need its own separate
discussion).
For reference, I’ve also attached the operator-like function approach
in 0001-WIP-Operator-approach-JSONB-dot-notation.txt.
I’d appreciate any feedback and thoughts!
Best,
Alex
Attachment | Content-Type | Size |
---|---|---|
0001-WIP-Operator-apporach-JSONB-dot-notation.txt | text/plain | 20.9 KB |
v7-0001-Allow-transformation-only-of-a-sublist-of-subscri.patch | application/octet-stream | 6.3 KB |
v7-0004-Implement-read-only-dot-notation-for-jsonb-using-.patch | application/octet-stream | 31.0 KB |
v7-0005-Allow-processing-of-.-by-generic-subscripting.patch | application/octet-stream | 18.0 KB |
v7-0003-Export-jsonPathFromParseResult.patch | application/octet-stream | 2.4 KB |
v7-0002-Pass-field-accessors-to-generic-subscripting.patch | application/octet-stream | 15.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Smith | 2025-02-05 07:28:09 | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Previous Message | Alexander Korotkov | 2025-02-05 06:59:40 | Re: Vacuum statistics |