
Re: How to find freak UTF-8 character?

From:	Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: How to find freak UTF-8 character?
Date:	2011-10-01 19:29:45
Message-ID:	20111001192706.GA44962@shinkuro.com
Lists:	pgsql-general

On Sat, Oct 01, 2011 at 07:55:01AM +0200, Leif Biberg Kristensen wrote:
> I've somehow introduced a spurious UTF-8 character in my database. When I try
> to export to an application that requires LATIN1 encoding, my export script
> bombs out with this message:
>
> psycopg2.DataError: character 0xe2808e of encoding "UTF8" has no equivalent in
> "LATIN1"

I see you found it, but note that it's _not_ a spurious UTF-8
character: 0xe2808e is U+200E, a left-to-right mark, and is a perfectly
ok UTF-8 code point.
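
As a quick check, the byte sequence from the error message can be decoded
to see which code point it actually is. This is only a minimal sketch using
Python's standard library; the variable names are made up for illustration.

```python
import unicodedata

# Bytes reported in the psycopg2 error message: 0xe2808e
raw = bytes.fromhex("e2808e")

# Decode the three UTF-8 bytes into a single Unicode character
ch = raw.decode("utf-8")

# Print the code point and its official Unicode name
print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# prints: U+200E LEFT-TO-RIGHT MARK
```

The character is invisible when rendered, which is probably why it is easy
to paste into a field without noticing.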

If you need a subset of the UTF-8 character set, you want to make sure
you have some sort of constraint in your application or your database
that prevents insertion of just any valid UTF-8 character. This is a
need people often forget when working in an internationalized setting,
because there's a lot of crap that comes from the client side in a
UTF-8 setting that might not come in other settings (like LATIN1).
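
One way such an application-side constraint might look is sketched below.
The function name `unencodable` and the sample string are hypothetical, and
the check uses Python's own "latin-1" codec, which is assumed here to be a
close enough stand-in for the server's LATIN1 conversion.

```python
def unencodable(text, encoding="latin-1"):
    """Return (index, character) pairs that cannot be encoded in `encoding`."""
    bad = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Remember where the offending character sits in the string
            bad.append((i, ch))
    return bad


# Hypothetical input: a name with a stray U+200E appended
sample = "Bj\u00f8rn\u200e"
problems = unencodable(sample)
for index, ch in problems:
    print(f"position {index}: U+{ord(ch):04X}")
# prints: position 5: U+200E
```

Running a check like this before the INSERT (or rejecting the input when
the list is non-empty) would keep characters out that the LATIN1 export
cannot handle later.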

Best,

--
Andrew Sullivan
ajs(at)crankycanuck(dot)ca

In response to

How to find freak UTF-8 character? at 2011-10-01 05:55:01 from Leif Biberg Kristensen

Responses

Re: How to find freak UTF-8 character? at 2011-10-01 21:16:06 from Leif Biberg Kristensen
