Re: Multibyte support and accented characters

From: "M(dot) Bastin" <marcbastin(at)mindspring(dot)com>
To: Lynna Landstreet <lynna(at)gallery44(dot)org>
Cc: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Multibyte support and accented characters
Date: 2003-06-12 23:54:53
Message-ID: a05210608bb0ec00f095b@[213.224.147.214]
Lists: pgsql-novice

At 7:08 PM -0400 6/12/03, Lynna Landstreet wrote:
>Hello all,
>
>Can you handle one more question from me? Not related to keyword checkboxes
>this time, promise. :-)
>
>Some of the text that will be entered in the database I'm working on
>includes some names and titles in other languages - predominantly French,
>but occasionally German, Spanish, etc. So I understand from reading the
>PostgreSQL docs that in order to handle this, we need to make sure multibyte
>support is enabled.
>
>Now, I am not very clear on the various encodings and how they work. I've
>been spoiled by years of working on a Mac where you just type option-e if
>you want an acute accent, option-u for an umlaut, etc. That's how most of
>the text that will be used to populate the database has been generated. So
>my questions are:
>
>1. Which encoding would be best for this? I'm guessing Unicode,

Unicode is indeed the safest choice. It is well on its way to
becoming the common standard across computer platforms.

> but I'm not
>sure. We pretty much only have to deal with western European languages, not
>with Russian or Chinese or anything.
>
>2. Once the right one is chosen and enabled, is the process pretty much
>transparent - i.e., just enter the text and the accented characters will
>come through fine,

No, not entirely. First create the database with a Unicode encoding:

CREATE DATABASE mydb WITH ENCODING = 'UNICODE';

Then the front-end you use for input must send its
data encoded as Unicode (UTF-8). If it sends another encoding,
use:

SET CLIENT_ENCODING TO '<whatever encoding the front-end uses>';

so that PostgreSQL automatically converts the data to Unicode.
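
To illustrate why the client encoding matters, here is a small sketch in Python (my choice for illustration, not something from the thread). It shows that the same accented character is a different byte sequence in each encoding. That text typed on a classic Mac arrives as Mac Roman is an assumption; check what your front-end actually sends.

```python
# Illustration only: one character, three different byte sequences.
text = "é"  # e with an acute accent

encoded = {
    "mac_roman": text.encode("mac_roman"),  # assumed classic Mac encoding
    "latin-1": text.encode("latin-1"),      # ISO 8859-1
    "utf-8": text.encode("utf-8"),          # what a Unicode database expects
}

for name, raw in encoded.items():
    print(name, raw.hex())

# Decoding bytes with the wrong encoding raises no error here;
# it silently produces a different character.
wrong = encoded["mac_roman"].decode("latin-1")
print(repr(wrong))
```

This is why the server has to be told which encoding the client really uses: the bytes alone do not say.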

Read the manual for further information:
http://www.postgresql.org/docs/view.php?version=7.3&file=multibyte.html

> or do I have to do something special with them, like the
>way they have to be encoded with &...; ASCII codes in HTML?
>
>3. Speaking of HTML, even if PostgreSQL is set up to correctly deal with
>accented characters, when the output is displayed on the web, are they going
>to have to be converted into &...; form?

Here too you have to tell the browser that it is going to receive data in
Unicode. I don't know whether you can do this in HTML, or whether
the user must choose Unicode from the appropriate browser menu.

Perhaps you can have PostgreSQL translate the output to ISO 8859-1 (Latin-1),
which is close to the Windows default (Windows-1252).
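
For reference, a page can declare its own encoding, in the HTTP Content-Type header or in a meta tag, so accented characters need not be turned into &...; references. The following is a minimal Python sketch under stated assumptions: `render_page` is a hypothetical helper and "Café" is made-up sample text. The last lines show numeric character references as a fallback when the output must be pure ASCII.

```python
import html


def render_page(body_text: str) -> bytes:
    """Hypothetical helper: wrap text in an HTML page that declares UTF-8."""
    page_text = (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "</head><body><p>" + html.escape(body_text) + "</p></body></html>"
    )
    # The bytes sent must really be in the encoding the page declares.
    return page_text.encode("utf-8")


page = render_page("Café")

# Fallback: replace non-ASCII characters with numeric character references.
ascii_fallback = "Café".encode("ascii", "xmlcharrefreplace")
print(ascii_fallback)
```

The declared charset and the actual byte encoding must agree; a mismatch gives the garbled accents the question is worried about.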

It's better if someone else answers this one for you.

Marc
