From: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net> |
Subject: | Re: [PATCH] json_lex_string: don't overread on bad UTF8 |
Date: | 2024-05-07 21:06:10 |
Message-ID: | CAOYmi+=yCFok+UNRHDJna5dSasqa9cMHviBZ6pYmtt1Yn_RfRg@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, May 6, 2024 at 8:43 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Fri, May 03, 2024 at 07:05:38AM -0700, Jacob Champion wrote:
> > We could port something like that to src/common. IMO that'd be more
> > suited for an actual conversion routine, though, as opposed to a
> > parser that for the most part assumes you didn't lie about the input
> > encoding and is just trying not to crash if you're wrong. Most of the
> > time, the parser just copies bytes between delimiters around and it's
> > up to the caller to handle encodings... the exceptions to that are the
> > \uXXXX escapes and the error handling.
>
> Hmm. That would still leave the backpatch issue at hand, which is
> kind of confusing to leave as it is. Would it be complicated to
> truncate the entire byte sequence in the error message and just give
> up because we cannot do better if the input byte sequence is
> incomplete?

Maybe I've misunderstood, but isn't that what's being done in v2?

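For readers following along, here is a minimal sketch of the "truncate and give up" idea discussed above. It is not the v2 patch itself; `utf8_expected_len` and `clamped_char_len` are hypothetical names invented for illustration, and the sketch assumes UTF-8 input and a caller that knows where the buffer ends.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): length of a UTF-8 sequence
 * as implied by its lead byte alone.
 */
static int
utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: report it alone */
}

/*
 * Hypothetical helper: how many bytes starting at s may be copied into an
 * error message without reading past end. Requires s < end.
 */
static size_t
clamped_char_len(const char *s, const char *end)
{
    size_t      want = (size_t) utf8_expected_len((unsigned char) *s);
    size_t      avail = (size_t) (end - s);

    /* If the sequence is cut short, report only the bytes that exist. */
    return (want < avail) ? want : avail;
}
```

With a buffer that ends after the first two bytes of a three-byte character, `clamped_char_len` returns 2 instead of the 3 that the lead byte advertises, so the error path never touches memory beyond the input.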
> > Maybe I'm missing
> > code somewhere, but I don't see a conversion routine from
> > json_errdetail() to the actual client/locale encoding. (And the parser
> > does not support multibyte input_encodings that contain ASCII in trail
> > bytes.)
>
> Referring to json_lex_string() that does UTF-8 -> ASCII -> give-up in
> its conversion for FRONTEND, I guess? Yep. This limitation looks
> like a problem, especially if plugging that to libpq.

Okay. How we deal with that will likely guide the "optimal" fix to
error reporting, I think...

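As an illustration of the trail-byte limitation mentioned in the quoted text, here is a small sketch (not parser code; `find_byte` is a hypothetical stand-in for any byte-wise delimiter scan). In Shift_JIS, the character 表 is encoded as the bytes 0x95 0x5C, and 0x5C is also the ASCII backslash, so a scanner that looks at single bytes sees an escape character that is not really there.

```c
#include <stddef.h>

/*
 * Hypothetical byte-wise scan: index of the first byte equal to c,
 * or -1 if there is none. It knows nothing about character boundaries.
 */
static int
find_byte(const char *s, size_t len, char c)
{
    for (size_t i = 0; i < len; i++)
    {
        if (s[i] == c)
            return (int) i;
    }
    return -1;
}
```

Calling `find_byte` on the two Shift_JIS bytes of 表 with `'\\'` reports a match at index 1, while the UTF-8 encoding of the same character (0xE8 0xA1 0xA8) produces no match, since UTF-8 trail bytes never fall in the ASCII range.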
Thanks,
--Jacob