From: | Markus Bertheau <twanger(at)bluetwanger(dot)de> |
---|---|
To: | Paolo Supino <paolo(at)telmap(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: unicode error and problem |
Date: | 2004-03-24 20:49:09 |
Message-ID: | 1080161348.1988.6.camel@yarrow.bertheau.de |
Lists: | pgsql-general pgsql-hackers |
On Wed, 24.03.2004, at 11:33, Paolo Supino wrote:
> Hi
>
> I received a unicode CSV file from someone (the file was created on a
> windows system) and I'm trying to import it into postgresql. When it gets to
> a line that isn't ascii it prints the following error and aborts: "ERROR:
> copy: line 33, Invalid UNICODE character sequence found (0xd956)".
Try to convert the file from UTF-16 (which might be the encoding of the
file) to UTF-8 with iconv:
iconv -f UTF-16 -t UTF-8 file > file.UTF-8
If the file is not in UTF-16 but in some other encoding, convert from
that encoding instead.
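If iconv is not at hand, the same conversion can be sketched in Python. This is only a rough illustration: the function name convert_to_utf8 and the file names are made up for the example, and the default source encoding of UTF-16 is the same guess as above, not something known about the file.

```python
def convert_to_utf8(src_path, dst_path, src_encoding="utf-16"):
    """Re-encode a text file as UTF-8.

    src_encoding is an assumption about the input; a wrong guess will
    usually raise UnicodeDecodeError, but a wrong single-byte encoding
    may silently yield garbage instead.
    """
    # newline="" keeps line endings exactly as they are in the source file.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        # Copy in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: src.read(64 * 1024), ""):
            dst.write(chunk)


if __name__ == "__main__":
    # Hypothetical file names, mirroring the iconv command above.
    convert_to_utf8("file", "file.UTF-8")
```

Python's "utf-16" codec reads the byte order mark to pick the byte order and drops it from the output, so the result should be plain UTF-8 that COPY can read.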
By the way, Unicode is just a number -> character mapping, it doesn't say
anything about the representation of that number in the byte stream.
UTF-8 and UTF-16 are such representation specifications.
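To make the distinction concrete, here is a small Python sketch that shows one Unicode number and its byte representations under several encodings. The helper name byte_forms and the sample character are chosen only for illustration.

```python
def byte_forms(ch):
    """Return the hex byte sequence of one character in several encodings."""
    return {
        "code point": hex(ord(ch)),
        "utf-8": ch.encode("utf-8").hex(),
        "utf-16-le": ch.encode("utf-16-le").hex(),
        "utf-16-be": ch.encode("utf-16-be").hex(),
    }


def is_valid_utf8(data):
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    zhe = "\u0416"  # CYRILLIC CAPITAL LETTER ZHE
    for name, value in byte_forms(zhe).items():
        print(name, value)
    # UTF-16 bytes (with byte order mark) are not a valid UTF-8 stream.
    print(is_valid_utf8(zhe.encode("utf-16")))
```

The number is the same in every case (0x416), but the bytes differ: d096 in UTF-8, 1604 in little-endian UTF-16, 0416 in big-endian UTF-16. Feeding UTF-16 bytes to something that expects UTF-8 is the kind of mismatch that produces an invalid byte sequence error.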
The encoding name in PostgreSQL should be changed from UNICODE to UTF-8
because UNICODE really just isn't an encoding.
--
Markus Bertheau <twanger(at)bluetwanger(dot)de>