From: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WIP Incremental JSON Parser |
Date: | 2024-03-14 19:35:29 |
Message-ID: | CAOYmi+kgckFsBNOBsrekDipq88G56pVndVXKcA7qMWYx9feChQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I've been poking at the partial token logic. The json_errdetail() bug
mentioned upthread (e.g. for an invalid input `[12zz]` and small chunk
size) seems to be due to the disconnection between the "main" lex
instance and the dummy_lex that's created from it. The dummy_lex
contains all the information about the failed token, which is
discarded upon an error return:
> partial_result = json_lex(&dummy_lex);
> if (partial_result != JSON_SUCCESS)
> return partial_result;
In these situations, there's an additional logical error:
lex->token_start is pointing to a spot in the string after
lex->token_terminator, which breaks an invariant that will mess up
later pointer math. Nothing appears to be setting lex->token_start to
point into the partial token buffer until _after_ the partial token is
successfully lexed, which doesn't seem right -- in addition to the
pointer math problems, if a previous chunk was freed (or on a stale
stack frame), lex->token_start will still be pointing off into space.
Similarly, wherever we set token_terminator, we need to know that
token_start is pointing into the same buffer.
Determining the end of a token is now done in two separate places
between the partial- and full-lexer code paths, which is giving me a
little heartburn. I'm concerned that those could drift apart, and if
the two disagree on where to end a token, we could lose data into the
partial token buffer in a way that would be really hard to debug. Is
there a way to combine them?
Thanks,
--Jacob
From | Date | Subject | |
---|---|---|---|
Next Message | Amonson, Paul D | 2024-03-14 19:50:46 | RE: Popcount optimization using AVX512 |
Previous Message | Robert Haas | 2024-03-14 19:23:48 | Re: Possibility to disable `ALTER SYSTEM` |