Re: [ADMIN] what's the efficient/safest way to convert database character set ?

From: "Huang, Suya" <Suya(dot)Huang(at)au(dot)experian(dot)com>
To: Steve Atkins <steve(at)blighty(dot)com>, "pgsql-general(at)postgresql(dot)org General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [ADMIN] what's the efficient/safest way to convert database character set ?
Date: 2013-10-18 03:51:23
Message-ID: D83E55F5F4D99B4A9B4C4E259E6227CD9DF2E5@AUX1EXC01.apac.experian.local
Lists: pgsql-general

Thanks Steve,

Yes, we're using SQL_ASCII.

Could you please be more specific about the manual data cleanup work? I'm new to Postgres and have no prior experience with character set conversion, so any specific experience you can share would be very much appreciated.

Thanks,
Suya

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Steve Atkins
Sent: Friday, October 18, 2013 11:08 AM
To: pgsql-general(at)postgresql(dot)org General
Subject: Re: [GENERAL] [ADMIN] what's the efficient/safest way to convert database character set ?

On Oct 17, 2013, at 3:13 PM, "Huang, Suya" <Suya(dot)Huang(at)au(dot)experian(dot)com> wrote:

> Hi,
>
> I have a question about converting a database from ASCII to UTF-8. What is the best approach if the database is very large? A detailed procedure or shared experience would be much appreciated!
>

The answer to that depends on what you mean by "ascii".

If your current database uses the SQL_ASCII encoding, that is not ASCII. It can contain anything, including any mix of encodings; no encoding has ever been enforced, so there is no way of knowing which ones are in there. If you've had, for example, web apps that let people paste Word documents into them, you potentially have different encodings used in different rows of the same table.

If your current data is like that, then you're probably looking at some (manual) data cleanup to work out what encoding your data is really in and convert it to something consistent, rather than a simple migration from ASCII to UTF-8.
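
As a rough illustration of the first pass of that cleanup, here is a minimal Python sketch that sorts raw byte strings into three buckets: pure ASCII, valid UTF-8, and everything else. It assumes the raw bytes of each text value have already been fetched from the database (how they are fetched is left out), and the names classify and summarize are hypothetical, not part of any Postgres tool. Values in the "other" bucket are the ones that would need a human to decide which legacy encoding (Latin-1, Windows-1252, and so on) they were really written in.

```python
from collections import Counter


def classify(raw: bytes) -> str:
    """Classify one raw byte string by the strictest encoding it decodes as."""
    try:
        raw.decode("ascii")
        return "ascii"          # safe as-is; ASCII is a subset of UTF-8
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"          # already valid UTF-8
    except UnicodeDecodeError:
        return "other"          # some legacy encoding; needs manual review


def summarize(values) -> Counter:
    """Count how many values fall into each bucket."""
    return Counter(classify(v) for v in values)


if __name__ == "__main__":
    # Made-up sample values standing in for rows of one column.
    samples = [
        b"plain text",
        "caf\u00e9".encode("utf-8"),
        "caf\u00e9".encode("latin-1"),
        b"\x93quoted\x94",      # Windows-1252 curly quotes, as pasted from Word
    ]
    print(summarize(samples))
```

Note that passing the UTF-8 check is only evidence, not proof: a short legacy-encoded string can occasionally happen to be valid UTF-8, so spot-checking the "utf-8" bucket is still worthwhile.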

Cheers,
Steve

