Re: unicode error and problem

From:	Markus Bertheau <twanger(at)bluetwanger(dot)de>
To:	Paolo Supino <paolo(at)telmap(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: unicode error and problem
Date:	2004-03-24 20:49:09
Message-ID:	1080161348.1988.6.camel@yarrow.bertheau.de
Lists:	pgsql-general pgsql-hackers

On Wed, 24.03.2004, at 11:33, Paolo Supino wrote:
> Hi
>
> I received a unicode CSV file from someone (the file was created on a
> windows system) and I'm trying to import it into postgresql. When it gets to
> a line that isn't ascii it prints the following error and aborts: "ERROR:
> copy: line 33, Invalid UNICODE character sequence found (0xd956)".

Try converting the file from UTF-16 (which is probably its encoding,
given that it was created on Windows) to UTF-8 with iconv:

iconv -f UTF-16 -t UTF-8 file > file.UTF-8

If the file is not in UTF-16 but in some other encoding, convert from
that encoding instead.
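
If you are not sure which encoding the file is in, the byte-order mark
(BOM) at its start is often a usable hint. Below is a minimal sketch in
Python, assuming the file begins with a BOM; the helper names
guess_encoding and to_utf8 are made up for this illustration, and a
file without a BOM is simply assumed to be UTF-8 already, which may be
wrong for your data.

```python
import codecs

# Byte-order marks and the Python codec that decodes (and strips) each one.
# UTF-32 is checked before UTF-16 because the UTF-32-LE BOM begins with
# the same two bytes as the UTF-16-LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on the leading BOM, or `default` if none.

    The fallback is an assumption, not a detection result.
    """
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    return default


def to_utf8(raw: bytes) -> bytes:
    """Decode `raw` using the guessed encoding and re-encode it as UTF-8."""
    return raw.decode(guess_encoding(raw)).encode("utf-8")
```

To convert a file, read it in binary mode, pass the bytes to to_utf8,
and write the result to a new file in binary mode.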

By the way, Unicode is just a mapping between numbers (code points) and
characters; it doesn't say anything about how those numbers are
represented in the byte stream. UTF-8 and UTF-16 are specifications of
such representations.
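
To make the distinction concrete, here is a small Python sketch that
takes one character and shows its code point next to its byte
representation in several encodings. The function name describe is
invented for this example.

```python
def describe(ch: str) -> dict:
    """Return the code point of `ch` and its bytes in three encodings."""
    return {
        "code_point": ord(ch),               # the Unicode number
        "utf-8": ch.encode("utf-8"),         # 1 to 4 bytes per character
        "utf-16-le": ch.encode("utf-16-le"), # 2 or 4 bytes, little-endian
        "utf-16-be": ch.encode("utf-16-be"), # 2 or 4 bytes, big-endian
    }


info = describe("\u00e9")  # LATIN SMALL LETTER E WITH ACUTE
print(hex(info["code_point"]))  # 0xe9
print(info["utf-8"])            # b'\xc3\xa9'
print(info["utf-16-le"])        # b'\xe9\x00'
print(info["utf-16-be"])        # b'\x00\xe9'
```

The code point is the same in every case, but the bytes differ, which
is why a program expecting UTF-8 rejects UTF-16 input as an invalid
byte sequence.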

The encoding name in PostgreSQL should be changed from UNICODE to UTF-8,
because Unicode itself is not an encoding.

--
Markus Bertheau <twanger(at)bluetwanger(dot)de>

