From: | "Huang, Suya" <Suya(dot)Huang(at)au(dot)experian(dot)com> |
---|---|
To: | John R Pierce <pierce(at)hogranch(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: [ADMIN] what's the efficient/safest way to convert database character set ? |
Date: | 2013-10-18 04:49:21 |
Message-ID: | D83E55F5F4D99B4A9B4C4E259E6227CD9DF35C@AUX1EXC01.apac.experian.local |
Lists: | pgsql-general |
Yes John, we will probably use a new database server to host the converted databases.
By export/import, do you mean:
1. pg_dump (// should I specify -E UTF8 to dump the data in UTF-8 encoding?)
2. create database xxx -E UTF8
3. pg_restore
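For reference, the three steps above could be sketched roughly as the shell function below. The function name and the names olddb, newdb and olddb.dump are placeholders of mine, not from this thread. The -Fc (custom format) option is there because pg_restore cannot read a plain-text dump, and -T template0 is there because createdb may refuse an encoding that differs from template1's. Depending on the server's locale settings, createdb might also need a matching --locale. As far as I understand, when the source database is SQL_ASCII the server does no conversion on output, so -E would only label the dump rather than fix the bytes in it.

```shell
# Sketch of the dump / create / restore sequence.
# All names passed in are placeholders chosen by the caller.
convert_via_dump() {
    src_db=$1
    dst_db=$2
    dump_file=$3

    # Custom-format dump (-Fc) so that pg_restore can read it;
    # -E asks pg_dump to write the data in the given encoding.
    pg_dump -Fc -E UTF8 -f "$dump_file" "$src_db" || return 1

    # template0 allows an encoding different from template1's.
    createdb -E UTF8 -T template0 "$dst_db" || return 1

    # Load the dump into the new UTF8 database.
    pg_restore -d "$dst_db" "$dump_file"
}

# Example call (not run here):
#   convert_via_dump olddb newdb olddb.dump
```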
I have also seen someone do it the following way:
1. perform a plain text dump of the database.
pg_dump -f db.sql [dbname]
2. convert the character encodings.
iconv db.sql -f ISO-8859-1 -t UTF-8 -o db.utf8.sql
3. create the UTF8 database
createdb utf8db (// I'm not sure why the encoding isn't specified here; it may be better to use -E to set the encoding to UTF8)
4. restore the converted UTF8 database.
psql -d utf8db -f db.utf8.sql
Which method is better? As far as I can tell, the second approach generates a larger dump file, so it would be better to pipe it through bzip2 to get a compressed file. Other than that, are there any other considerations?
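To make the iconv step in the second approach concrete, here is a small sketch on a stand-in file rather than a real dump. The file names and the temporary directory are made up for illustration. It uses redirection instead of -o, since as far as I know -o is not part of POSIX iconv. Note that this only gives correct results if every byte in the dump really is ISO-8859-1, which a SQL_ASCII database does not guarantee.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Stand-in for a plain-text dump: "café" with é stored as the
# single Latin-1 byte 0xE9 (octal 351), followed by a newline.
printf 'caf\351\n' > "$workdir/sample.latin1.sql"

# Same conversion as step 2, using stdin/stdout instead of -o.
iconv -f ISO-8859-1 -t UTF-8 < "$workdir/sample.latin1.sql" > "$workdir/sample.utf8.sql"

# In UTF-8 the é takes two bytes (0xC3 0xA9), so the converted
# file is one byte longer than the original.
wc -c < "$workdir/sample.latin1.sql"
wc -c < "$workdir/sample.utf8.sql"
```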
Thanks,
Suya
-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of John R Pierce
Sent: Friday, October 18, 2013 11:23 AM
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] [ADMIN] what's the efficient/safest way to convert database character set ?
On 10/17/2013 3:13 PM, Huang, Suya wrote:
> I've got a question about converting a database from ASCII to UTF-8. What's
> the best approach if the database is very large?
> A detailed procedure or shared experience would be much appreciated!
I believe you will need to dump the whole database, and import it into a
new database that uses UTF8 encoding. As far as I know, there's no way
to convert encoding in place. As the other gentlemen pointed out, you
also will have to convert/sanitize all text data, as your current
SQL_ASCII fields could easily contain stuff that's not valid UTF8.
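One possible way to find out whether a plain-text dump contains such data is to run it through iconv with UTF-8 as both source and target, which should fail on any byte sequence that is not valid UTF-8. The sketch below is only an illustration: the function name and the dump file name in the comment are placeholders, not something from this thread, and it says nothing about which encoding the offending bytes actually are.

```shell
# Succeeds (exit status 0) only if iconv can read the whole file
# as UTF-8; any invalid byte sequence makes iconv exit non-zero.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example call (not run here; db.sql is a placeholder):
#   if is_valid_utf8 db.sql; then
#       echo "dump is already valid UTF-8"
#   else
#       echo "dump contains bytes that need converting or cleaning"
#   fi
```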
For large databases, this is a major undertaking. I find it's often
easiest to do a major change like this between the old and a new
database server.
--
john r pierce 37N 122W
somewhere on the middle of the left coast