Re: encoding advice requested

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Rick Schumeyer" <rschumeyer(at)ieee(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding advice requested
Date: 2006-11-12 13:43:44
Message-ID: 20061112144324.28920
Lists: pgsql-general

Rick Schumeyer wrote:

> My database locale is en_US, and by default my databases are UTF8.
>
> My application code allows the user to paste text into a box and submit
> it to the database. Sometimes the pasted text contains non UTF8
> characters, typically the "fancy" forms of quotes and apostrophes. The
> database does not appreciate it when the application attempts to store
> these characters.
>
> What is the best option to deal with this problem?
>
> a) I think I could re-create the database with a LATIN1 encoding. I'm
> not real experienced with different encodings, are there any issues with
> combining en_US and LATIN1?
> b) I can issue a SET CLIENT_ENCODING TO 'LATIN1'; statement every time I
> open a connection. A brief test indicates this will work.

Be aware that "fancy" quotes and apostrophes are not representable in
LATIN1. The closest character set that does include them is probably
WIN1252; see http://en.wikipedia.org/wiki/Windows-1252, especially the
characters in the 0x91-0x94 range.
Your application may be implicitly using this encoding, especially
if it runs under Windows. In that case the more appropriate
solution to your problem would be to set client_encoding to
WIN1252 while keeping your database in UTF8.
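
To make the byte-level difference concrete, here is a small Python sketch
(standard library only). It assumes the application sends the raw bytes
0x91-0x94; the variable names are illustrative and not taken from the
original application. It shows that these bytes are curly quotes in
Windows-1252, control characters in LATIN1, and not valid UTF-8 at all.

```python
# Bytes 0x91-0x94, as a Windows application might send them when the
# user pastes "fancy" quotes (assumption: the client really uses WIN1252).
raw = bytes([0x91, 0x92, 0x93, 0x94])

# Windows-1252 maps these bytes to the curly quote characters
# U+2018, U+2019, U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# ISO-8859-1 (LATIN1) maps the same bytes to C1 control characters
# U+0091..U+0094, which is why LATIN1 does not help here.
as_latin1 = raw.decode("latin-1")

# The raw bytes are not valid UTF-8, so a UTF8 database rejects them
# when client_encoding is also UTF8.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The curly quotes cannot be encoded in LATIN1 at all.
try:
    as_cp1252.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# This is the byte sequence a UTF8 database ends up storing once the
# input has been converted from WIN1252 to UTF-8.
stored = as_cp1252.encode("utf-8")

print([hex(ord(c)) for c in as_cp1252])
print([hex(ord(c)) for c in as_latin1])
print(valid_utf8, fits_latin1, stored.hex())
```

On the PostgreSQL side, the matching step is to issue
SET client_encoding TO 'WIN1252'; after connecting, so that the server
performs this WIN1252-to-UTF8 conversion itself.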

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
