From: | Arnaud Lesauvage <thewild(at)freesurf(dot)fr> |
---|---|
To: | Arnaud Lesauvage <thewild(at)freesurf(dot)fr>, Tomi NA <hefest(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: MSSQL to PostgreSQL : Encoding problem |
Date: | 2006-11-22 14:34:34 |
Message-ID: | 45645FFA.2040006@freesurf.fr |
Lists: | pgsql-general |
Alvaro Herrera wrote:
> Arnaud Lesauvage wrote:
>> Alvaro Herrera wrote:
>> >Arnaud Lesauvage wrote:
>> >>Tomi NA wrote:
>> >>>>I think I'll go this way... No other choice, actually !
>> >>>>The MSSQL database is in SQL_Latin1_General_CP1_CI_AS.
>> >>>>I don't really understand what this is. It supports the euro
>> >>>>symbol, so it is probably not pure LATIN1, right ?
>> >>>
>> >>>I suppose you'd have to look at the latin1 codepage character table
>> >>>somewhere...I'm a UTF-8 guy so I'm not well suited to respond to the
>> >>>question. :)
>> >>
>> >>Yep, http://en.wikipedia.org/wiki/Latin-1 tells me that
>> >>LATIN1 is missing the euro sign...
>> >>Grrrrr I hate this !!!
>> >
>> >So use Latin9 ...
>>
>> Of course, but it doesn't work !!!
>> Whatever client encoding I choose in postgresql before
>> COPYing, I get the 'invalid byte sequence error'.
>
> Humm ... how are you choosing the client encoding? Is it actually
> working? I don't see how choosing Latin1 or Latin9 and feeding whatever
> byte sequence would give you an "invalid byte sequence". These charsets
> don't have any way to validate the bytes, as opposed to what UTF-8 can
> do. So you could end up with invalid bytes if you choose the wrong
> client encoding, but that's a different error.
>
mydb=# SET client_encoding TO LATIN9;
SET
mydb=# COPY statistiques.detailrecherche (log_gid,
champrecherche, valeurrecherche) FROM
'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV;
ERROR: invalid byte sequence for encoding "LATIN9": 0x00
HINT: This error can also happen if the byte sequence does
not match the encoding expected by the server, which is
controlled by "client_encoding".
CONTEXT: COPY detailrecherche, line 9212
mydb=# SET client_encoding TO WIN1252;
SET
mydb=# COPY statistiques.detailrecherche (log_gid,
champrecherche, valeurrecherche) FROM
'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV;
ERROR: invalid byte sequence for encoding "WIN1252": 0x00
HINT: This error can also happen if the byte sequence does
not match the encoding expected by the server, which is
controlled by "client_encoding".
CONTEXT: COPY detailrecherche, line 9212
Really, I'd rather have another error, but this is all I can
get.
This is with the "ANSI" export.
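Both attempts stop on the same 0x00 byte at line 9212, so it may help to look at the file itself rather than at the client encoding. Below is a minimal Python sketch (not something from this thread) that lists where NUL bytes occur. The function name `find_nul_bytes` and the sample data are hypothetical, and lines are split on raw newline bytes, so the numbers may differ from COPY's count if quoted CSV fields contain line breaks.

```python
from typing import List, Tuple


def find_nul_bytes(data: bytes, limit: int = 10) -> List[Tuple[int, int]]:
    """Return (line_number, byte_column) pairs for the first `limit` NUL bytes.

    Line numbers are 1-based and count raw newline bytes; columns are
    0-based byte offsets within the line.
    """
    hits: List[Tuple[int, int]] = []
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        column = line.find(b"\x00")
        while column != -1 and len(hits) < limit:
            hits.append((line_number, column))
            column = line.find(b"\x00", column + 1)
        if len(hits) >= limit:
            break
    return hits


if __name__ == "__main__":
    # Hypothetical sample standing in for the exported CSV.
    sample = b"1,champ,valeur\n2,cha\x00mp,valeur\n3,champ,val\x00eur\n"
    for line_number, column in find_nul_bytes(sample):
        print(f"NUL byte at line {line_number}, byte column {column}")
```

To run it against the real export, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `find_nul_bytes`.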
With the "UNICODE" export:
mydb=# SET client_encoding TO UTF8;
SET
mydb=# COPY statistiques.detailrecherche (log_gid,
champrecherche, valeurrecherche) FROM
'E:\\Production\\Temp\\detailrecherche_unicode.csv' CSV;
ERROR: invalid byte sequence for encoding "UTF8": 0xff
HINT: This error can also happen if the byte sequence does
not match the encoding expected by the server, which is
controlled by "client_encoding".
CONTEXT: COPY detailrecherche, line 592680
So, line 592680 is *a lot* better, but it is still not good!
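To see what is actually at the failing position in the "UNICODE" export, one option is to let a decoder report the first byte that is not valid UTF-8. This is a minimal Python sketch under the assumption that the file fits in memory; `first_invalid_utf8` and the sample bytes are hypothetical, and the line number counts raw newline bytes, which may not match COPY's numbering exactly.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int, bytes]]:
    """Return (line_number, byte_offset, offending_byte), or None if valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte the decoder rejected.
        line_number = data.count(b"\n", 0, err.start) + 1
        return line_number, err.start, data[err.start:err.start + 1]
    return None


if __name__ == "__main__":
    # Hypothetical sample: one valid UTF-8 line, then a line with a stray 0xff.
    sample = "1,a,caf\u00e9\n".encode("utf-8") + b"2,b,\xff\n"
    print(first_invalid_utf8(sample))
    # A lone 0xff is never valid UTF-8, but it is a letter in WIN1252.
    print(repr(b"\xff".decode("cp1252")))
```

A byte like 0xff that decodes cleanly as WIN1252 but not as UTF-8 would suggest that part of the export is not really UTF-8, though that is only a guess until the bytes around the reported offset are inspected.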
--
Arnaud