From: | David Steele <david(at)pgmasters(dot)net> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: making the backend's json parser work in frontend code |
Date: | 2020-01-21 22:34:55 |
Message-ID: | 3649c0bb-a4f2-e59b-3322-ea740a179b2f@pgmasters.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Robert,
On 1/17/20 2:33 PM, Robert Haas wrote:
> On Fri, Jan 17, 2020 at 12:36 PM David Steele <david(at)pgmasters(dot)net>
wrote:
>
>> I used the callbacks because that's the first method I found but it
>> seems like json_lex() might be easier to use in practice.
>
> Ugh, really? That doesn't seem like it would be nice at all.
I guess it's a matter of how you want to structure the code.
>> I think it's an issue that the entire string must be passed to the lexer
>> at once. That will not be great for large manifests. However, I don't
>> think it will be all that hard to implement an optional "want more"
>> callback in the lexer so JSON data can be fed in from the file in
chunks.
>
> I thought so initially, but now I'm not so sure. The thing is, you
> actually need all the manifest data in memory at once anyway, or so I
> think. You're essentially doing a "full join" between the contents of
> the manifest and the contents of the file system, so you've got to
> scan one (probably the filesystem) and then mark entries in the other
> (probably the manifest) used as you go.
Yeah, having a copy of the manifest in memory is the easiest way to do
validation, but I think you'd want it in a structured format.
We parse the file part of the manifest into a sorted struct array which
we can then do binary searches on by filename.
>> So, that just leaves ereport() as the largest remaining issue? I'll
>> look at that today and Tuesday and see what I can work up.
>
> PFA my work on that topic. As compared with my previous patch series,
> the previous 0001 is dropped and what are now 0001 and 0002 are the
> same as patches from the previous series. 0003 and 0004 are aiming
> toward getting rid of ereport() and, I believe, show a plausible
> strategy for so doing. There are, possibly, things not to like here,
> and it's certainly incomplete, but I think I kinda like this
> direction. Comments appreciated.
>
> 0003 nukes lex_accept(), inlining the logic into callers. I found that
> the refactoring I wanted to do in 0004 was pretty hard without this,
> and it turns out to save code, so I think this is a good idea
> independently of anything else.
No arguments here.
> 0004 adjusts many functions in jsonapi.c to return a new enumerated
> type, JsonParseErrorType, instead of directly doing ereport(). It adds
> a new function that takes this value and a lexing context and throws
> an error. The JSON_ESCAPING_INVALID case is wrong and involves a gross
> hack, but that's fixable with another field in the lexing context.
> More work is needed to really bring this up to scratch, but the idea
> is to make this code have a soft dependency on ereport() rather than a
> hard one.
My first reaction was that if we migrated ereport() first it would make
this all so much easier. Now I'm no so sure.
Having a general json parser in libcommon that is not tied into a
specific error handling/logging system actually sounds like a really
nice thing to have. If we do migrate ereport() the user would always
have the choice to call throw_a_json_error() if they wanted to.
There's also a bit of de-duplication of error messages, which is nice,
especially in the case JSON_ESCAPING_INVALID. And I agree that this
case can be fixed with another field in the lexer -- or at least so it
seems to me.
Though, throw_a_json_error() is not my favorite name. Perhaps
json_ereport()?
Regards,
--
-David
david(at)pgmasters(dot)net
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2020-01-21 22:37:26 | Re: Psql patch to show access methods info |
Previous Message | Alvaro Herrera | 2020-01-21 22:33:29 | Re: Psql patch to show access methods info |