From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIP Incremental JSON Parser |
Date: | 2024-02-22 09:38:32 |
Message-ID: | 3efe8333-3285-1d9d-5ad3-0f9784d5c1c4@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2024-02-21 We 15:26, Jacob Champion wrote:
> On Wed, Feb 21, 2024 at 6:50 AM Jacob Champion
> <jacob(dot)champion(at)enterprisedb(dot)com> wrote:
>> On Tue, Feb 20, 2024 at 9:32 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> *sigh* That's weird. I wonder why you can reproduce it and I can't. Can
>>> you give me details of the build? OS, compiler, path to source, build
>>> setup etc.? Anything that might be remotely relevant.
> This construction seems suspect, in json_lex_number():
>
>> if (lex->incremental && !lex->inc_state->is_last_chunk &&
>> len >= lex->input_length)
>> {
>> appendStringInfoString(&lex->inc_state->partial_token,
>> lex->token_start);
>> return JSON_INCOMPLETE;
>> }
> appendStringInfoString() isn't respecting the end of the chunk: if
> there's extra data after the chunk boundary (as
> AppendIncrementalManifestData() does) then all of that will be stuck
> onto the end of the partial_token.
>
> I'm about to context-switch off of this for the day, but I can work on
> a patch tomorrow if that'd be helpful. It looks like this is not the
> only call to appendStringInfoString().
>
Yeah, the issue seems to be with chunks of json that are not
null-terminated. We don't require that they be so this code was buggy.
It wasn't picked up earlier because the tests that I wrote did put a
null byte at the end. Patch 5 in this series fixes those issues and
adjusts most of the tests to add some trailing junk to the pieces of
json, so we can be sure that this is done right.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v8-0001-Introduce-a-non-recursive-JSON-parser.patch | text/x-patch | 421.1 KB |
v8-0002-Add-support-for-incrementally-parsing-backup-mani.patch | text/x-patch | 7.6 KB |
v8-0003-Use-incremental-parsing-of-backup-manifests.patch | text/x-patch | 11.5 KB |
v8-0004-add-logging-traces-to-see-if-we-can-find-out-what.patch | text/x-patch | 3.3 KB |
v8-0005-fixes-for-non-null-terminated-inputs-for-incremen.patch | text/x-patch | 6.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Sutou Kouhei | 2024-02-22 09:39:48 | Re: Make COPY format extendable: Extract COPY TO format implementations |
Previous Message | Daniel Gustafsson | 2024-02-22 09:33:04 | Re: Test to dump and restore objects left behind by regression |