Re: ERROR: unsupported Unicode escape sequence - in JSON-type column

From: Jan Bilek <jan(dot)bilek(at)eftlab(dot)com(dot)au>
To: Erik Wienhold <ewie(at)ewie(dot)name>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: ERROR: unsupported Unicode escape sequence - in JSON-type column
Date: 2023-02-28 01:03:10
Message-ID: 63ae2d9c-fc24-1de8-63e2-10bb0b6c6e5a@eftlab.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2/28/23 01:17, Erik Wienhold wrote:
>> On 27/02/2023 13:13 CET Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>>
>> I'd be curious to know how the customer managed to do that.
>> Perhaps there is a loophole in PostgreSQL that needs to be fixed.
> Probably via some data access layer and not directly via Postgres. It's easy
> to reproduce with psycopg:
>
> import psycopg
>
> with psycopg.connect() as con:
> con.execute('create temp table jsontab (jsoncol json)')
> con.execute(
> 'insert into jsontab (jsoncol) values (%s)',
> [psycopg.types.json.Json('\0')],
> )
>
> with con.execute('select jsoncol from jsontab') as cur:
> print(cur.fetchall())
>
> try:
> with con.execute('select jsoncol::jsonb from jsontab') as cur:
> pass
> raise AssertionError("jsonb should fail")
> except psycopg.errors.UntranslatableCharacter:
> pass
>
> Another reason to prefer jsonb over json to reject such inputs right away.
> The documentation states that json does not validate inputs in constrast to
> jsonb.
>
> Of course the OP now has to deal with json. The data can be sanitized by
> replacing all null character escape sequences:
>
> update jsontab
> set jsoncol = replace(jsoncol::text, '\u0000', '')::json
> where strpos(jsoncol::text, '\u0000') > 0;
>
> But the data access layer (or whatever got the json into the database) must be
> fixed as well to reject or sanitize those inputs in the future.
>
> --
> Erik

Hi Erik,

No, it didn't go through any foreign data access layer - it went in
straight through the Postgresql variable bind using pre-cached insert
statement using PostgreSQL 14.5, connected over UNIX sockets.

Strange thing happened afterwards - that locating that record was on &
off - I couldn't pin-point it in DB as it seemed to be failing on
multiple places ... until using that trick from Laurenz. Felt like a
PostgreSQL memory corruption, but system remained stable without any
complaints.

Thanks & Cheers,
Jan

--
Jan Bilek - CTO at EFTlab Pty Ltd.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thorsten Glaser 2023-02-28 01:22:47 Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)
Previous Message Jan Bilek 2023-02-28 00:55:02 Re: ERROR: unsupported Unicode escape sequence - in JSON-type column