pg_dump encoding problem

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: pg_dump encoding problem
Date: 2006-10-19 18:22:56
Message-ID: 1161282176.8476.23.camel@dogma.v10.wvs
Lists: pgsql-general

I am migrating a database from 7.4 in SQL_ASCII encoding to 8.1 in UTF8.
I made a pg_dump of the 7.4 database. I had difficulty loading it
straight into 8.1 with UTF8 because the original database contains bytes
that are not valid UTF8, such as 0xb9, so I imported it into a temporary
8.1 cluster that I initialized with SQL_ASCII encoding. That import went
fine.
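
As an illustration of the problem with bytes such as 0xb9, here is a minimal Python sketch that reports where a chunk of dump data fails to decode as UTF-8. The function name find_invalid_utf8 is hypothetical and not part of pg_dump or PostgreSQL; it only assumes the dump can be read as raw bytes.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for each run that is not valid UTF-8.

    Hypothetical helper for inspecting a dump; not a PostgreSQL tool.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # If the remainder decodes cleanly, there is nothing more to report.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice being decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end
    return problems


if __name__ == "__main__":
    # 0xb9 is a UTF-8 continuation byte, so it is invalid on its own.
    print(find_invalid_utf8(b"a\xb9c"))
```

Run over the bytes of a dump file, this would give a list of offsets to inspect before deciding how to treat each value.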

So, basically, I am now trying to move data from 8.1 in SQL_ASCII to 8.1
in UTF8. I know that the text fields in UTF8 can handle the invalid
sequences because I can do:

=> create table foo(t text);
CREATE TABLE
=> insert into foo values(E'a\xb9c');
INSERT 0 1
=> insert into foo values('abc');
INSERT 0 1
=> select t,length(t) from foo;
  t  | length
-----+--------
 ac  |      3
 abc |      3

That's how I want to import the data. I want the application to behave
as much like before as possible, so I would rather not strip the binary
characters.

Is there a way to get pg_dump to use the escape sequences instead of
writing the binary value? Is what I'm trying to do dangerous?
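
To make the kind of rewriting being asked about concrete, here is a rough Python sketch that replaces every byte above 0x7f with a backslash-x escape, in the same form as the E'a\xb9c' literal shown earlier. This is not a pg_dump option; escape_high_bytes is a hypothetical helper. It does not handle backslashes already present in the data, and whether a given server version accepts such escapes in a UTF8 database is an open question here, not something the sketch establishes.

```python
def escape_high_bytes(data: bytes) -> bytes:
    """Rewrite each non-ASCII byte as a literal \\xNN escape.

    Hypothetical post-processing step; existing backslashes are left
    untouched, which a real tool would have to account for.
    """
    out = bytearray()
    for b in data:
        if b >= 0x80:
            # Emit the four ASCII characters: backslash, 'x', two hex digits.
            out += b"\\x%02x" % b
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    print(escape_high_bytes(b"a\xb9c"))
```

The result contains only ASCII bytes, so the file itself is valid in any ASCII-compatible encoding; the interpretation of the escapes is left to whatever loads it.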

I am still investigating how the application filters the data. If it
sends the binary character inside the query, is there any way to make a
UTF8-encoded database accept that? Do I have to create a separate
database encoded with SQL_ASCII?

Regards,
Jeff Davis
