From: | Jan Bilek <jan(dot)bilek(at)eftlab(dot)com(dot)au> |
---|---|
To: | Erik Wienhold <ewie(at)ewie(dot)name>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: ERROR: unsupported Unicode escape sequence - in JSON-type column |
Date: | 2023-02-28 01:03:10 |
Message-ID: | 63ae2d9c-fc24-1de8-63e2-10bb0b6c6e5a@eftlab.com.au |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 2/28/23 01:17, Erik Wienhold wrote:
>> On 27/02/2023 13:13 CET Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>>
>> I'd be curious to know how the customer managed to do that.
>> Perhaps there is a loophole in PostgreSQL that needs to be fixed.
> Probably via some data access layer and not directly via Postgres. It's easy
> to reproduce with psycopg:
>
> import psycopg
>
> with psycopg.connect() as con:
> con.execute('create temp table jsontab (jsoncol json)')
> con.execute(
> 'insert into jsontab (jsoncol) values (%s)',
> [psycopg.types.json.Json('\0')],
> )
>
> with con.execute('select jsoncol from jsontab') as cur:
> print(cur.fetchall())
>
> try:
> with con.execute('select jsoncol::jsonb from jsontab') as cur:
> pass
> raise AssertionError("jsonb should fail")
> except psycopg.errors.UntranslatableCharacter:
> pass
>
> Another reason to prefer jsonb over json to reject such inputs right away.
> The documentation states that json does not validate inputs in constrast to
> jsonb.
>
> Of course the OP now has to deal with json. The data can be sanitized by
> replacing all null character escape sequences:
>
> update jsontab
> set jsoncol = replace(jsoncol::text, '\u0000', '')::json
> where strpos(jsoncol::text, '\u0000') > 0;
>
> But the data access layer (or whatever got the json into the database) must be
> fixed as well to reject or sanitize those inputs in the future.
>
> --
> Erik
Hi Erik,
No, it didn't go through any foreign data access layer - it went in
straight through the Postgresql variable bind using pre-cached insert
statement using PostgreSQL 14.5, connected over UNIX sockets.
Strange thing happened afterwards - that locating that record was on &
off - I couldn't pin-point it in DB as it seemed to be failing on
multiple places ... until using that trick from Laurenz. Felt like a
PostgreSQL memory corruption, but system remained stable without any
complaints.
Thanks & Cheers,
Jan
--
Jan Bilek - CTO at EFTlab Pty Ltd.
From | Date | Subject | |
---|---|---|---|
Next Message | Thorsten Glaser | 2023-02-28 01:22:47 | Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!) |
Previous Message | Jan Bilek | 2023-02-28 00:55:02 | Re: ERROR: unsupported Unicode escape sequence - in JSON-type column |