From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> |
Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net> |
Subject: | Re: [PATCH] json_lex_string: don't overread on bad UTF8 |
Date: | 2024-05-09 04:27:08 |
Message-ID: | ZjxQnOD1OoCkEeMN@paquier.xyz |
Lists: | pgsql-hackers |
On Wed, May 08, 2024 at 07:01:08AM -0700, Jacob Champion wrote:
> On Tue, May 7, 2024 at 10:31 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> But looking closer, I can see that in the JSON_INVALID_TOKEN case,
>> when !tok_done, we set token_terminator to point to the end of the
>> token, and that would include an incomplete byte sequence like in your
>> case. :/
>
> Ah, I see what you're saying. Yeah, that approach would need some more
> invasive changes.
My first feeling was actually to do that, and report the location in
the input string where we are seeing issues. All code paths playing
with token_terminator would need to track that.
> Agreed. Fortunately (or unfortunately?) I think the JSON
> client-encoding work is now a prerequisite for OAuth in libpq, so
> hopefully some improvements can fall out of that work too.
I'm afraid so. I don't quite see how this would be OK to tweak on
stable branches, but all areas that could report error states with
partial byte sequence contents would benefit from such a change.
>> Thoughts and/or objections?
>
> None here.
This is somewhat mitigated by the fact that d6607016c738 is recent, but
the behavior has been incorrect since v13, so I have backpatched the fix
down to that.
--
Michael