Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From: Alon <asimantov(at)tableausoftware(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Date: 2014-09-19 22:15:53
Message-ID: 1411164953375-5819745.post@n5.nabble.com
Lists: pgsql-bugs

The pg_dump file contains this command:
CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
    LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252'
    LC_CTYPE = 'Norwegian (Bokmål)_Norway.1252';

The UTF-16 encoding of ål) [a-ring, l, closing parenthesis] is:
00e5 006c 0029

In UTF-8, the same characters are encoded as:
c3 a5 6c 29

The a-ring becomes two bytes (c3 a5), while the other two characters take
one byte each.
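
The two encodings above can be reproduced with a short Python sketch. It is
only an illustration of the byte values, not anything taken from the
PostgreSQL code; the variable names are made up for this example, and
bytes.hex() with a separator needs Python 3.8 or later.

```python
# Illustration only: reproduce the UTF-16 and UTF-8 encodings of "ål)".
text = "\u00e5l)"  # a-ring, l, closing parenthesis

# Big-endian UTF-16 without a BOM, grouped into 16-bit code units.
utf16_units = text.encode("utf-16-be").hex(" ", 2)

# UTF-8, one group per byte.
utf8_bytes = text.encode("utf-8").hex(" ")

print(utf16_units)
print(utf8_bytes)
```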

Based on the ERROR:
invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29

It appears the set of characters is getting passed as:
e5 6c 29

In UTF-8, e5 is always the lead byte of a three-byte character, so possibly
pg_restore, createdb, or something else tries to read these three bytes as
a single character.
However, 6c and 29 can only be single-byte characters; they cannot be the
second and third bytes of a three-byte character. Hence the failure.
It seems that somewhere in the code, U+00E5 is converted to the single byte
e5 instead of 'c3 a5' when the LC_COLLATE and LC_CTYPE values are passed.
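
The reasoning above can be checked with a small Python sketch. It assumes the
locale name reached the server in Windows-1252 (suggested by the ".1252"
suffix, but not confirmed here); it does not touch PostgreSQL, and all names
in it are hypothetical.

```python
# Illustration only. Assumption: the bytes arrive as Windows-1252.
raw = "\u00e5l)".encode("cp1252")  # the bytes e5 6c 29

# A UTF-8 lead byte of the form 1110xxxx announces a three-byte character.
is_three_byte_lead = (raw[0] & 0xF0) == 0xE0

# Continuation bytes must have the form 10xxxxxx; 6c and 29 do not.
is_continuation = [(b & 0xC0) == 0x80 for b in raw[1:]]

# Decoding the raw bytes as UTF-8 therefore fails at the first byte.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Transcoding from Windows-1252 first gives the expected UTF-8 bytes.
fixed = raw.decode("cp1252").encode("utf-8")

print(raw.hex(" "), is_three_byte_lead, is_continuation)
print(error)
print(fixed.hex(" "))
```

If the assumption holds, transcoding the locale name from the client code
page to UTF-8 before it is sent would produce c3 a5 6c 29 rather than
e5 6c 29.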

--
View this message in context: http://postgresql.1045698.n5.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5819745.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
