
Logical Replication and Character encoding

From:	"Shinoda, Noriyoshi" <noriyoshi(dot)shinoda(at)hpe(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Logical Replication and Character encoding
Date:	2017-01-31 12:46:18
Message-ID:	AT5PR84MB0084FAE5976D89CDE9733093EE4A0@AT5PR84MB0084.NAMPRD84.PROD.OUTLOOK.COM
Lists:	pgsql-hackers

Hi hackers,

I tried out the committed logical replication feature in a test environment and found that replication between databases with different encodings does not convert the encoding of character-type columns. Is this the intended behavior?

I expected the character 0xe6bca2 (UTF-8) to be converted to the same character in EUC_JP, 0xb4c1. The example below replicates from a UTF-8 database (the publisher, first session) to an EUC_JP database (the subscriber, second session). As the page dump shows, the UTF-8 bytes 0xe6bca2 are stored unchanged in the subscriber's database.

postgres=> CREATE TABLE encode1(col1 NUMERIC PRIMARY KEY, col2 VARCHAR(10)) ;
CREATE TABLE
postgres=> CREATE PUBLICATION pub1 FOR TABLE encode1 ;
CREATE PUBLICATION
postgres=> INSERT INTO encode1 VALUES (1, '漢') ; -- UTF-8 Character 0xe6bca2
INSERT 0 1

postgres=> CREATE TABLE encode1(col1 NUMERIC PRIMARY KEY, col2 VARCHAR(10)) ;
CREATE TABLE
postgres=# CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub1 ;
NOTICE: created replication slot "sub1" on publisher
CREATE SUBSCRIPTION
postgres=# SELECT * FROM encode1 ;
ERROR: invalid byte sequence for encoding "EUC_JP": 0xa2
postgres=# SELECT heap_page_items(get_raw_page('encode1', 0)) ;
heap_page_items
-------------------------------------------------------------------
(1,8152,1,33,565,0,0,"(0,1)",2,2306,24,,,"\\x0b0080010009e6bca2") <- stored UTF-8 char 0xe6bca2
(1 row)
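
For reference, the byte values quoted above can be checked outside the database. This is only a minimal Python sketch, and it assumes that Python's built-in euc_jp codec agrees with PostgreSQL's EUC_JP conversion for this one character; the names used (utf8_bytes, decodes_as_eucjp, and so on) are illustrative and not part of PostgreSQL.

```python
# Byte representation of the test character in each encoding.
ch = "漢"
utf8_bytes = ch.encode("utf-8")
eucjp_bytes = ch.encode("euc_jp")

print(utf8_bytes.hex())   # e6bca2
print(eucjp_bytes.hex())  # b4c1

# Tuple data copied from the heap_page_items output on the subscriber.
tuple_data = bytes.fromhex("0b0080010009e6bca2")

# The stored value ends with the UTF-8 bytes, not the EUC_JP ones.
stored_is_utf8 = tuple_data.endswith(utf8_bytes)
stored_is_eucjp = tuple_data.endswith(eucjp_bytes)


def decodes_as_eucjp(data: bytes) -> bool:
    """Return True if data is a valid EUC_JP byte sequence for Python's codec."""
    try:
        data.decode("euc_jp")
        return True
    except UnicodeDecodeError:
        return False


print(stored_is_utf8, stored_is_eucjp)
print(decodes_as_eucjp(utf8_bytes), decodes_as_eucjp(eucjp_bytes))
```

The three UTF-8 bytes are not a valid EUC_JP sequence, which is consistent with the error that the SELECT on the subscriber reports.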

Snapshot:
https://ftp.postgresql.org/pub/snapshot/dev/postgresql-snapshot.tar.gz 2017-01-31 00:29:07
Operating System:
Red Hat Enterprise Linux 7 Update 2 (x86-64)

Regards.

Responses

Re: Logical Replication and Character encoding at 2017-02-01 03:05:40 from Kyotaro HORIGUCHI
