Re: error while trying to change the database encoding on a database

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Geoffrey Myers <lists(at)serioustechnology(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: error while trying to change the database encoding on a database
Date: 2011-01-24 17:53:53
Message-ID: 20110124175353.GA1909@svana.org
Lists: pgsql-general

On Mon, Jan 24, 2011 at 12:16:46PM -0500, Geoffrey Myers wrote:
> We hope to identify the characters and fix them in the existing
> database, then convert. It appears to be very limited, but it would
> help if there was some way to identify these characters outside of
> simply doing the reload of the data and finding the errors.
>
> Hence the reason I asked about a resource that might identify the
> characters.

Short answer: any byte with the high bit set, i.e. any byte with a
value of 0x80 or above, since those fall outside plain ASCII.
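
As a rough illustration of that test, here is a minimal Python sketch
that scans raw bytes (for example, the contents of a plain-text dump)
and reports which lines contain bytes with the high bit set. The
function name and the idea of scanning a dump file are assumptions for
the example, not something from the thread. Note that it flags every
non-ASCII byte, including ones that may turn out to be valid in the
target encoding.

```python
def find_high_bit_lines(data: bytes):
    """Return (line_number, offending_bytes) for each line that
    contains at least one byte with the high bit set (>= 0x80)."""
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        # Iterating over bytes yields ints; b & 0x80 is non-zero
        # exactly when the high bit is set.
        bad = bytes(b for b in line if b & 0x80)
        if bad:
            hits.append((lineno, bad))
    return hits


if __name__ == "__main__":
    sample = b"plain ascii\ncaf\xe9 au lait\nmore ascii"
    for lineno, bad in find_high_bit_lines(sample):
        print(lineno, bad)
```

For a real dump you would read the file in binary mode, e.g.
open(path, "rb").read(), so that Python does not try to decode it first.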

You're going to need to assign them a meaning, i.e. decide which
encoding each of those bytes was meant to be in. Additionally, you're
going to have to fix your code so that it only outputs correctly
encoded data.

The suggestion to simply reload the database as if all the current data
were WIN1251 or Latin-9 is a fairly easy way of getting the database
into a reasonable format. The data would still have to be checked
afterwards, though.
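
To help with that checking, one possible approach is to decode a
suspect byte string under each candidate encoding and compare the
results by eye. The sketch below assumes Python's codec names "cp1251"
(for WIN1251) and "iso8859_15" (for Latin-9); the function name is made
up for the example.

```python
def preview_decodings(raw: bytes, encodings=("cp1251", "iso8859_15")):
    """Decode the same bytes under each candidate encoding so the
    results can be compared. Bytes that are undefined in an encoding
    are shown as U+FFFD instead of raising an error."""
    return {enc: raw.decode(enc, errors="replace") for enc in encodings}


if __name__ == "__main__":
    for enc, text in preview_decodings(b"caf\xe9").items():
        print(enc, text)
```

The same byte can be a perfectly valid but different character in each
encoding, which is why a reload that succeeds does not by itself prove
the data came out right.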

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle
