From: | Reece Pegues <RPegues(at)tripwire(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table |
Date: | 2016-03-22 00:47:23 |
Message-ID: | D3160E38.2CF84%rpegues@tripwire.com |
Lists: | pgsql-bugs |
I see, thanks Tom!
On 3/21/16, 5:04 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>Reece Pegues <RPegues(at)tripwire(dot)com> writes:
>> Looks like the database is created with ENCODING = 'SQL_ASCII'
>
>Basically, what that does is defeat all encoding checks inside the
>backend; it'll store whatever bytes you give it. So yeah, substring()
>is expected to deal in bytes not characters in this encoding.
>
>> So I assume it was thus saving the data that way, and then if the client
>> encoding is utf8 it tried to encode to that and failed?
>
>If the client declares its encoding, the backend will verify correct encoding
>before transmitting data; but if the database encoding is SQL_ASCII then
>no actual conversion happens, only a validity check at transmit/receive.
>
> regards, tom lane
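
For anyone reading this thread later, here is a minimal sketch of the behavior described above, written in Python rather than against a real server. It only mimics the idea: a byte-oriented substring (as under SQL_ASCII) can split a multi-byte UTF-8 character, and a later validity check rejects the result without converting anything. The helper names substring_bytes, substring_chars, and is_valid_utf8 are made up for illustration and are not PostgreSQL functions.

```python
# Illustrative sketch only; these helpers are hypothetical, not PostgreSQL APIs.

def substring_bytes(data: bytes, start: int, length: int) -> bytes:
    """1-based, byte-oriented slice, mimicking substring() when the
    database encoding is SQL_ASCII (no knowledge of characters)."""
    return data[start - 1 : start - 1 + length]


def substring_chars(text: str, start: int, length: int) -> str:
    """1-based, character-oriented slice, mimicking substring() when the
    database encoding is a real multi-byte encoding such as UTF8."""
    return text[start - 1 : start - 1 + length]


def is_valid_utf8(data: bytes) -> bool:
    """Validity check only: the bytes are inspected, never converted."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "na\u00efve"            # U+00EF takes two bytes in UTF-8 (0xC3 0xAF)
raw = text.encode("utf-8")     # what the server would store byte-for-byte

cut_bytes = substring_bytes(raw, 1, 3)   # stops in the middle of U+00EF
cut_chars = substring_chars(text, 1, 3)  # stops on a character boundary

print(cut_bytes, is_valid_utf8(cut_bytes))
print(cut_chars.encode("utf-8"), is_valid_utf8(cut_chars.encode("utf-8")))
```

The byte-oriented cut keeps the lead byte 0xC3 but drops its continuation byte, so the stored value is no longer valid UTF-8, and a client that declares UTF8 would hit the validity check when reading it back.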