Re: making the backend's json parser work in frontend code

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making the backend's json parser work in frontend code
Date: 2020-01-23 03:00:35
Message-ID: AD733159-7FD4-42EF-B3F5-252964D37FFF@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jan 22, 2020, at 12:11 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> On 2020-Jan-22, Robert Haas wrote:
>
>> On Wed, Jan 22, 2020 at 2:26 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>>> I'm not sure I see the point of keeping json.h split from jsonapi.h. It
>>> seems to me that you could move back all the contents from jsonapi.h
>>> into json.h, and everything would work just as well. (Evidently the
>>> Datum in JsonEncodeDateTime's proto is problematic ... perhaps putting
>>> that prototype in jsonfuncs.h would work.)
>>>
>>> I don't really object to your 0001 patch as posted, though.
>>
>> The goal is to make it possible to use the JSON parser in the
>> frontend, and we can't do that if the header files that would have to
>> be included on the frontend side rely on things that only work in the
>> backend. As written, the patch series leaves json.h with a dependency
>> on Datum, so the stuff that it leaves in jsonapi.h (which is intended
>> to be the header that gets moved to src/common and included by
>> frontend code) can't be merged with it.
>
> Right, I agree with that goal, and as I said, I don't object to your
> patch as posted.
>
>> Now, we could obviously rearrange that. I don't think any of the file
>> naming here is great. But I think we probably want, as far as
>> possible, for the code in FOO.c to correspond to the prototypes in
>> FOO.h. What I'm thinking we should work towards is:
>>
>> json.c/h - support for the 'json' data type
>> jsonb.c/h - support for the 'jsonb' data type
>> jsonfuncs.c/h - backend code that doesn't fit in either of the above
>> jsonapi.c/h - lexing/parsing code that can be used in either the
>> frontend or the backend
>
> ... it would probably require more work to make this 100% attainable,
> but I don't really care all that much.
>
>> I'm not wedded to that. It just looks like the most natural thing from
>> where we are now.
>
> Let's go with it.

I have this done in my local repo to the point that I can build frontend tools against the json parser that is now in src/common and also run all the check-world tests without failure. I’m planning to post my work soon, possibly tonight if I don’t run out of time, but more likely tomorrow.

The main issue remaining is that my repo has a lot of stuff organized differently than Robert’s patches, so I’m trying to turn my code into a simple extension of his work rather than having my implementation compete against his.

For the curious, the code as Robert left it still relies on the DatabaseEncoding though the use of GetDatabaseEncoding, pg_mblen, and similar, and that has been changed in my patches to only rely on the database encoding in the backend, with the code in src/common taking an explicit encoding, which the backend gets in the usual way and the frontend might get with PQenv2encoding() or whatever the frontend programmer finds appropriate. Hopefully, this addresses Robert’s concern upthread about the filesystem name not necessarily being in utf8 format, though I might be misunderstanding the exact thrust of his concern. I can think of other possible interpretations of his concern as he expressed it, so I’ll wait for him to clarify.

For those who want a sneak peak, I’m attaching WIP patches to this email with all my changes, with Robert’s changes partially manually cherry-picked and the rest still unmerged. *THESE ARE NOT MEANT FOR COMMIT. THIS IS FOR ADVISORY PURPOSES ONLY.*. I have some debugging cruft left in here, too, like gcc __attribute__ stuff that won’t be in the patches I submit.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Move-pg_wchar-from-src-include-mb-to-src-include-com.patch.WIP application/octet-stream 60.1 KB
0002-Moving-backend-only-functions-from-pg_wchar.h-into-m.patch.WIP application/octet-stream 51.5 KB
0003-Moving-pg_wchar.h-declared-functions-from-backend-to.patch.WIP application/octet-stream 2.2 KB
0004-Moving-jsonapi.h-to-src-include-common.patch.WIP application/octet-stream 5.4 KB
0005-Moving-backend-functions-out-of-src-include-common-j.patch.WIP application/octet-stream 5.6 KB
0006-Moving-common-functions-into-jsonapi.c.patch.WIP application/octet-stream 5.5 KB
0007-Moving-json-parsing-logic-into-common.patch.WIP application/octet-stream 5.4 KB
0008-Remove-json.c-s-lex_accept.patch.WIP application/octet-stream 6.0 KB
0009-Making-json-parsing-work-without-throwing-exceptions.patch.WIP application/octet-stream 43.9 KB
0010-Moving-json-parsing-into-src-common.patch.WIP application/octet-stream 55.4 KB
0011-Adding-src-bin-pg_test_json.patch.WIP application/octet-stream 5.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Zakirov 2020-01-23 03:17:22 Re: Allow to_date() and to_timestamp() to accept localized names
Previous Message Alexander Korotkov 2020-01-23 02:41:37 Re: Improve search for missing parent downlinks in amcheck