RE: inserts bypass encoding conversion

From: "James Pang (chaolpan)" <chaolpan(at)cisco(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-admin(at)lists(dot)postgresql(dot)org" <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: RE: inserts bypass encoding conversion
Date: 2023-08-17 02:25:57
Message-ID: PH0PR11MB51912A70CBC6FF376F8B106CD61AA@PH0PR11MB5191.namprd11.prod.outlook.com
Lists: pgsql-admin

So insert into ... values (chr(226)||chr(128)||chr(166)) actually stored the value in the database as LATIN1, as a sequence of single-byte characters, but when I query select * from testutf8, it is first converted to a three-byte UTF8 sequence?

jamet=# select chr(226)||chr(128)||chr(166);
?column?
----------
...
(1 row)

jamet=# select * from testutf8;
test
--------------------------------------------------------------------------------
...

jamet=# select encode(test::bytea,'hex') from testutf8;
encode
--------
e280a6

Thanks,

James

-----Original Message-----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Thursday, August 17, 2023 9:33 AM
To: James Pang (chaolpan) <chaolpan(at)cisco(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: inserts bypass encoding conversion

"James Pang (chaolpan)" <chaolpan(at)cisco(dot)com> writes:
> In this case, the real value stored in database is UTF8 byte sequence
> instead of LATIN1 encoding text, right?

Not if you have server_encoding = LATIN1, as you stated earlier.
In that case, the data in the database is in LATIN1, and chr() interprets its argument as a LATIN1 code value --- which happens to look enough like a Unicode code point to be possibly confusing, until you try to use code points that aren't within LATIN1.

regards, tom lane
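
The byte arithmetic behind this exchange can be modeled outside the database. The sketch below is only an illustration in Python, not PostgreSQL code: it assumes server_encoding = LATIN1, so that each chr() value from 1 to 255 becomes exactly one stored byte. The variable names are made up for the example. It suggests one possible reading of the output above: the three stored bytes e2 80 a6 are three separate LATIN1 characters, and they only look like a single character when something downstream (a terminal, for instance) treats the raw bytes as UTF8.

```python
# Model of the byte arithmetic only; this does not connect to PostgreSQL.
# Assumption: server_encoding = LATIN1, so each value is one character, one byte.
code_values = [226, 128, 166]

# Three LATIN1 characters, stored as three single bytes.
text = "".join(chr(n) for n in code_values)
stored = text.encode("latin-1")
print(stored.hex())  # e280a6

# A real LATIN1 -> UTF8 conversion expands each character to two bytes.
converted = text.encode("utf-8")
print(converted.hex())  # c3a2c280c2a6

# If the raw stored bytes are instead read as UTF8 without conversion,
# they happen to form the single code point U+2026 (horizontal ellipsis).
misread = stored.decode("utf-8")
print(misread == "\u2026")  # True

# A code point outside LATIN1, such as U+2026 itself, has no LATIN1 byte.
try:
    "\u2026".encode("latin-1")
    outside_latin1_ok = True
except UnicodeEncodeError:
    outside_latin1_ok = False
print(outside_latin1_ok)  # False
```

Under these assumptions, a result of e280a6 from encode(test::bytea,'hex') matches three one-byte LATIN1 characters, whereas data that had really been converted to UTF8 would show c3a2c280c2a6.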
