Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alon <asimantov(at)tableausoftware(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Date: 2014-09-21 05:18:46
Message-ID: 20140921051846.GA1565935@tornado.leadboat.com
Lists: pgsql-bugs

On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
> The pg_dump file contains this command:
> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
> LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
> (Bokmål)_Norway.1252';
>
> The UTF16 encoding for ål) [a-ring l parenthesis] is
> 00e5 006c 0029
>
> In UTF8 this set of characters is encoded as:
> c3 a5 6c 29
>
> The a-ring is converted to two bytes while the others are one.
>
> Based on the ERROR:
> invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29
>
> It appears the set of characters is getting passed as:
> e5 6c 29
>
> In UTF8, e5 is always the start of a three-byte character, so possibly
> pg_restore, createdb, or something else tries to read these bytes as a single
> character.
> However, 6c and 29 can only be single-byte characters; they can't be the
> next two bytes in a three-byte character. Hence the failure.
> It seems that, in the code, 0x00e5 is converted to e5 instead of 'c3 a5' when
> passing the LC_COLLATE and LC_CTYPE values.

In WIN1252, "e5 6c 29" is "ål)". We're likely failing to set client_encoding
at some essential point in the process.
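
The byte-level reading can be checked with a minimal Python sketch. It assumes Python's "cp1252" codec as a stand-in for WIN1252, and the variable names are purely illustrative; it shows only how the three bytes decode, not where the restore path goes wrong.

```python
# Bytes reported in the server's error message.
raw = bytes.fromhex("e56c29")

# Read as WIN1252 (Python codec name "cp1252"), they are the expected text.
as_win1252 = raw.decode("cp1252")
print(as_win1252)

# The same text, correctly encoded as UTF8, has a two-byte a-ring.
utf8_hex = " ".join(f"{b:02x}" for b in as_win1252.encode("utf-8"))
print(utf8_hex)

# Read as UTF8, the raw bytes are invalid: 0xe5 opens a three-byte
# sequence, but 0x6c is not a continuation byte.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```

If the bytes arrive at the server labeled as UTF8 without having been converted from WIN1252, the server would see exactly the invalid sequence quoted in the error.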
