Re: MSSQL to PostgreSQL : Encoding problem

From: Richard Huxton <dev(at)archonet(dot)com>
To: Arnaud Lesauvage <thewild(at)freesurf(dot)fr>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: MSSQL to PostgreSQL : Encoding problem
Date: 2006-11-21 22:12:19
Message-ID: 456379C3.7060802@archonet.com
Lists: pgsql-general

Arnaud Lesauvage wrote:
> Hi list !
>
> I already posted this as "COPY FROM encoding error", but I have been
> doing some more tests since then.
>
> I'm trying to export data from MS SQL Server to PostgreSQL.
> The tables are quite big (>20M rows), so a CSV export and a "COPY FROM"
> import seems to be the only reasonable solution.

Or go via MS-Access/Perl and ODBC/DBI perhaps?

> In DTS, I have 3 options to export a table as a text file : ANSI, OEM
> and UNICODE.
> I tried all these options (and I have three files, one for each).

Well, what character-set is your database in?

> I then try to import into PostgreSQL. The furthest I can get is when
> using the UNICODE export, and importing it using a client_encoding set
> to UTF8 (I tried WIN1252, LATIN9, LATIN1, ...).
> The copy then stops with an error :
> ERROR: invalid byte sequence for encoding "UTF8": 0xff
> SQL state: 22021
>
> The problematic character is the euro currency symbol.

You'll want UTF-8 or LATIN9 for the euro symbol. LATIN1 has the same
code point (0xA4 in LATIN9), but assigns it to a different symbol, the
generic currency sign.
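
To make the difference concrete, here is a small Python sketch (not part of the original exchange) showing how the euro sign is encoded in a few candidate character sets. The codec names are Python's own: "iso8859_15" corresponds to LATIN9 and "cp1252" to WIN1252.

```python
# Illustrative only: compare how the euro sign is represented in
# several encodings, and what the LATIN9 byte means under LATIN1.
EURO = "\u20ac"  # the euro sign

# Bytes produced for the euro sign by each codec.
euro_bytes = {
    codec: EURO.encode(codec)
    for codec in ("utf-8", "iso8859_15", "cp1252")
}

# LATIN1 has no euro sign at all, so encoding it fails.
try:
    EURO.encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

# The byte LATIN9 uses for the euro decodes to another symbol in LATIN1.
latin9_byte_as_latin1 = euro_bytes["iso8859_15"].decode("latin-1")

for codec, raw in euro_bytes.items():
    print(codec, raw.hex())
print("latin-1 can encode euro:", latin1_has_euro)
print("0xA4 read as latin-1: U+%04X" % ord(latin9_byte_as_latin1))
```

None of these encodings produces a bare 0xff byte for the euro sign, so the error quoted above points at how the file as a whole is encoded rather than at this one character.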

Your first step needs to be to find out what character-set your data is in.
Your second is then to decide what char-set you want to use to store it
in PG.
Then you can decide how to get there.
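
As a rough sketch of the first step, the following Python checks the start of an export file for a byte-order mark and rewrites the file as UTF-8. It rests on an assumption the thread does not confirm: that the "UNICODE" export is UTF-16 with a leading BOM (bytes FF FE), which would be consistent with the 0xff in the error. The function names, the file paths in the usage note, and the "cp1252" fallback are all hypothetical choices for illustration.

```python
import codecs

# Known byte-order marks, longest first so that a UTF-32 LE mark
# (FF FE 00 00) is not mistaken for a UTF-16 LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(head):
    """Return a Python codec name if head starts with a known BOM, else None."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None


def transcode_to_utf8(src_path, dst_path, fallback="cp1252"):
    """Rewrite src_path as UTF-8 without a BOM and return the codec used.

    The fallback codec is a guess for files without a BOM; verify it
    against the real data before trusting the result.
    """
    with open(src_path, "rb") as f:
        encoding = sniff_bom(f.read(4)) or fallback
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=encoding, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
    return encoding
```

Calling transcode_to_utf8("export_unicode.csv", "export_utf8.csv") (hypothetical file names) would then give a file that can be loaded with client_encoding set to UTF8, provided the source encoding was detected correctly.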

--
Richard Huxton
Archonet Ltd
