Re: Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: jan(dot)lentfer(at)web(dot)de
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail
Date: 2016-01-27 00:45:10
Message-ID: 20160127.094510.1792875178673024937.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I am not familiar with pgadmin3 but it seems a bug with pgadmin3 or
misuse of it to me. Probably when this:

> ---> use pgadmin3 on Windows to set comment on database coment_test with
> ---> string "für", while being connected to postgres (SQL_ASCII encoded)

was executed, the client encoding is not properly set to UTF-8 for
some reasons.

I did almost similar thing except that I used Japanese HIRAGANA
character (I do not write German) and used psql to input COMMENT
command. Also the PostgreSQL version is 9.5.0, which is different from
yours but I don't think there has been no change in this area.

t-ishii(at)localhost: initdb --no-locale -E SQL_ASCII -D /tmp/testdb
The files belonging to this database system will be owned by user "t-ishii".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".
[snip]
Success. You can now start the database server using:

pg_ctl -D /tmp/testdb -l logfile start

t-ishii(at)localhost: export PGPORT=5435
t-ishii(at)localhost: pg_ctl -D /tmp/testdb/ start
t-ishii(at)localhost: psql -l
Pager usage is off.
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+---------+-----------+---------+-------+-------------------------
postgres | t-ishii | SQL_ASCII | C | C |
template0 | t-ishii | SQL_ASCII | C | C | =c/"t-ishii" +
| | | | | "t-ishii"=CTc/"t-ishii"
template1 | t-ishii | SQL_ASCII | C | C | =c/"t-ishii" +
| | | | | "t-ishii"=CTc/"t-ishii"
(3 rows)

t-ishii(at)localhost: psql postgres
postgres=# create database comment_test template template0 encoding 'UTF-8';
CREATE DATABASE
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
--------------+---------+-----------+---------+-------+-------------------------
comment_test | t-ishii | UTF8 | C | C |
postgres | t-ishii | SQL_ASCII | C | C |
template0 | t-ishii | SQL_ASCII | C | C | =c/"t-ishii" +
| | | | | "t-ishii"=CTc/"t-ishii"
template1 | t-ishii | SQL_ASCII | C | C | =c/"t-ishii" +
| | | | | "t-ishii"=CTc/"t-ishii"
(4 rows)

postgres=# \encoding UTF8
postgres=# comment on database comment_test is 'あいう';
COMMENT
postgres=# \q
t-ishii(at)localhost: pg_dump -Fc comment_test -f /tmp/comment_test.pgdump
t-ishii(at)localhost: psql postgres
Pager usage is off.
psql (9.5.0)
Type "help" for help.

postgres=# alter database comment_test rename to comment_test_orig;
ALTER DATABASE
postgres-# \q
t-ishii(at)localhost: pg_restore -C -d template1 /tmp/comment_test.pgdump
t-ishii(at)localhost: psql postgres
Pager usage is off.
psql (9.5.0)
Type "help" for help.

postgres=#
postgres=# \l+
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges | Size | Tablespace | Description
-------------------+---------+-----------+---------+-------+-------------------------+---------+------------+--------------------------------------------
comment_test | t-ishii | UTF8 | C | C | | 6992 kB | pg_default | あいう
comment_test_orig | t-ishii | UTF8 | C | C | | 6992 kB | pg_default | あいう
postgres | t-ishii | SQL_ASCII | C | C | | 6992 kB | pg_default | default administrative connection database
template0 | t-ishii | SQL_ASCII | C | C | =c/"t-ishii" +| 6873 kB | pg_default | unmodifiable empty database
| | | | | "t-ishii"=CTc/"t-ishii" | | |
template1 | t-ishii | SQL_ASCII | C | C | =c/"t-ishii" +| 6992 kB | pg_default | default template for new databases
| | | | | "t-ishii"=CTc/"t-ishii" | | |
(5 rows)

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> We found a problem with comments on databases using german umlauts
> (äöü) - or more generally, we found encoding problems when doing
> "COMMENT ON database foo ..." when foo is UTF-8 encoded and the
> database you are connected to (e.g. postgres) is SQL_ASCII.
> I analyzed this with the help of Andrew Gierth (and others) on IRC -
> it seems in that constellation you can write an non-UTF-8 comment onto
> a UTF-8 database.
> This leads to the problem, that wen trying to use pg_restore -C with a
> dump created with -Fc the restore will fail when trying to do the
> "COMMENT ON DATABASE.."
> We stumpled across this because also pg_upgrade fails on this (during
> the schema part), but doesn't detect this situation with the -c
> (check) option beforehand.
> This is 9.4.5 on Solaris 11 - but we did have that problem already
> when using pg_upgrade from 8.4 to 9.1 a few years back. I am sorry we
> didn't analyze further back then, but it was "just a comment". But now
> it bit us again....
>
> Here is what I did so far to analyze the problem (with instrucions
> from Andrew Gierth). I hope this makes it clear, otherwise please
> don't hesitate to request more details.
>
>
> postgres=# create database comment_test template template0 encoding
> 'UTF-8';
> CREATE DATABASE
> #####
>
> ---> use pgadmin3 on Windows to set comment on database coment_test with
> ---> string "für", while being connected to postgres (SQL_ASCII encoded)
> #####
>
> $ pg_dump -Fc comment_test -f comment_test.pgdump
> postgres=# alter database comment_test rename to comment_test_orig;
> ALTER DATABASE
> #####
>
> $ pg_restore -C -d template1 comment_test.pgdump
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 1989; 1262 641528
> COMMENT comment_test postgres
> pg_restore: [archiver (db)] could not execute query: ERROR: invalid
> byte sequence for encoding "UTF8": 0xfc
> Command was: COMMENT ON DATABASE comment_test IS 'f¦r';
> ###
>
> postgres(at)sz-pp-pg02-dev[~] $ pg_restore comment_test.pgdump
> --
> -- PostgreSQL database dump
> --
>
> SET statement_timeout = 0;
> SET lock_timeout = 0;
> SET client_encoding = 'UTF8';
> SET standard_conforming_strings = on;
> SET check_function_bodies = false;
> SET client_min_messages = warning;
>
> --
> -- Name: comment_test; Type: COMMENT; Schema: -; Owner: postgres
> --
>
> COMMENT ON DATABASE comment_test IS 'f¦r';
>
>
> --
> -- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
> --
>
> CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
>
>
> --
> -- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
> --
>
> COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
>
>
> --
> -- Name: public; Type: ACL; Schema: -; Owner: postgres
> --
>
> REVOKE ALL ON SCHEMA public FROM PUBLIC;
> REVOKE ALL ON SCHEMA public FROM postgres;
> GRANT ALL ON SCHEMA public TO postgres;
> GRANT ALL ON SCHEMA public TO PUBLIC;
>
>
> --
> -- PostgreSQL database dump complete
> --
> #######
>
> $ pg_restore comment_test.pgdump | head -16 | tail -1 |xxd
> 0000000: 434f 4d4d 454e 5420 4f4e 2044 4154 4142 COMMENT ON DATAB
> 0000010: 4153 4520 636f 6d6d 656e 745f 7465 7374 ASE comment_test
> 0000020: 2049 5320 2766 fc72 273b 0a IS 'f.r';.
> ######
>
> postgres=# select
> convert_to(shobj_description(oid,'pg_database'),'SQL_ASCII') from
> pg_database where datname='comment_test_orig';
> convert_to
> ------------
> \x66fc72
> (1 row)
> ######
>
>
> postgres=# \l
> List of databases
> Name | Owner | Encoding | Collate | Ctype | Access privileges
> --------------------+-----------+-----------+------------------+------------------+-----------------------
> comment_test | postgres | UTF8 | C | C |
> [...]
> postgres | postgres | SQL_ASCII | C | C |
> template0 | postgres | SQL_ASCII | C | C | =c/postgres +
> | | | | | postgres=CTc/postgres
> template1 | postgres | SQL_ASCII | C | C | postgres=CTc/postgres+
> | | | | | =c/postgres
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2016-01-27 01:27:43 Re: BUG #13888: pg_dump write error
Previous Message Peter Geoghegan 2016-01-26 20:29:55 Re: BUG #13891: Deparsed arbiter WHERE clauses cannot be parsed by Postgres