BUG #16926: initdb fails on Windows when binary path contains certain non-ASCII characters

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: ebakke(at)ultorg(dot)com
Subject: BUG #16926: initdb fails on Windows when binary path contains certain non-ASCII characters
Date: 2021-03-14 00:43:35
Message-ID: 16926-e4ef345545d185ab@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16926
Logged by: Eirik Bakke
Email address: ebakke(at)ultorg(dot)com
PostgreSQL version: 13.1
Operating system: Windows 10
Description:

On PostgreSQL 13.1 on US English Windows 10, "initdb" will fail with the
following error if the initdb.exe executable is located on a path that
contains certain non-ASCII characters, and "--encoding=UTF8" is specified.
In the following example, I am executing initdb.exe from a folder called
"C:\Users\ebakke\ZRoot\PostgresTest\FolderÆØÅ\pgsql\bin":

=================================================
C:\Users\ebakke\ZRoot\PostgresTest\FolderÆØÅ\pgsql\bin>initdb -D C:\Users\ebakke\Deletables\pgtest --encoding=UTF8 --locale=en_US
The files belonging to this database system will be owned by user "ebakke".
This user must also own the server process.

The database cluster will be initialized with locale "en_US".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory C:/Users/ebakke/Deletables/pgtest ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... windows
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... US/Eastern
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... 2021-03-13 19:23:58.206 EST [18296] FATAL: invalid byte sequence for encoding "UTF8": 0xc6 0xd8
child process exited with exit code 1
initdb: removing data directory "C:/Users/ebakke/Deletables/pgtest"
=================================================
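
The two bytes in the FATAL message match the Windows-1252 encoding of the
first two non-ASCII letters in the folder name. The following is a minimal
Python sketch of that observation, not a diagnosis of the server code: it
assumes the path reaches the server as Windows-1252 bytes, and the helper
name is_valid_utf8 is made up for illustration.

```python
# The folder name suffix from the report, written with escapes so the
# snippet does not depend on the encoding of this message.
folder_suffix = "\u00c6\u00d8\u00c5"  # AE, O-slash, A-ring

# Bytes quoted in the FATAL message.
reported = bytes([0xC6, 0xD8])

# Assumption: the path is handed over in the Windows-1252 code page.
cp1252_bytes = folder_suffix.encode("cp1252")
utf8_bytes = folder_suffix.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(cp1252_bytes.hex(" "))        # Windows-1252 form of the suffix
print(utf8_bytes.hex(" "))          # UTF-8 form of the same suffix
print(is_valid_utf8(cp1252_bytes))  # the Windows-1252 bytes are not valid UTF-8
```

In UTF-8, 0xc6 announces a two-byte sequence, but 0xd8 is not a valid
continuation byte, which is consistent with the byte pair quoted in the error.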

The problem does _not_ occur on Linux or macOS (I tested on all three OSes).
The error goes away if an ASCII-only path is used. This may not always be an
option, however, as Windows users might only have access to their home
directories, and home directory names are permitted to contain such
characters. The error also goes away if "--encoding=UTF8" is removed, but
this forces the entire database to use the Windows-1252 character set, which
is not acceptable.
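
For applications that can choose where the binaries are unpacked, a
pre-flight check along the following lines could flag a risky location before
initdb is run. This is only a sketch of the ASCII-only workaround described
above: the function name and the example paths are hypothetical, and the
check is deliberately conservative, since only certain non-ASCII characters
were observed to trigger the failure.

```python
def has_non_ascii(path: str) -> bool:
    """Return True if the path contains any character outside ASCII.

    Hypothetical helper: it does not reproduce PostgreSQL's behavior, it
    only flags paths that fall outside the ASCII-only workaround.
    """
    return not path.isascii()  # str.isascii() requires Python 3.7+


# Example paths (illustrative only); the first mirrors the failing layout.
risky = "C:\\Users\\ebakke\\Folder\u00c6\u00d8\u00c5\\pgsql\\bin"
safe = "C:\\pgsql\\bin"

for candidate in (risky, safe):
    status = "non-ASCII, may fail" if has_non_ascii(candidate) else "ASCII-only"
    print(f"{candidate!a}: {status}")
```

Such a check does not help users whose only writable location is a home
directory with non-ASCII characters in its name.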

The Windows binaries for PostgreSQL 13.1 were downloaded from
https://www.enterprisedb.com/download-postgresql-binaries . The error is
easy to reproduce, and occurs every time. In case the non-ASCII letters are
mangled in this message, the characters I tested with are the last three
letters of the Danish and Norwegian alphabet, which you can copy and
paste from https://en.wikipedia.org/wiki/Danish_and_Norwegian_alphabet .

This bug was discovered while working on Ultorg ( https://twitter.com/ultorg
), a desktop application which bundles PostgreSQL binaries and requires them
to run from wherever the user unzips the application package.
