Re: Multibyte support and accented characters

From: Lynna Landstreet <lynna(at)gallery44(dot)org>
To: "M(dot) Bastin" <marcbastin(at)mindspring(dot)com>
Cc: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Multibyte support and accented characters
Date: 2003-06-17 18:43:58
Message-ID: BB14D9AE.3C9%lynna@gallery44.org
Lists: pgsql-novice

on 6/12/03 7:54 PM, M. Bastin at marcbastin(at)mindspring(dot)com wrote:

>>1. Which encoding would be best for this? I'm guessing Unicode,
>
>Unicode is indeed the safest way to go. It's well on its way to becoming
>the new common standard of all computer platforms.

Cool, that's what I thought.

>>2. Once the right one is chosen and enabled, is the process pretty much
>>transparent - i.e., just enter the text and the accented characters will
>>come through fine,
>
>Then the front-end, with which you're doing your input, must send its
>data encoded in unicode UTF-8. If it sends it in another encoding, then
>use:
>
>SET CLIENT_ENCODING TO '<whatever encoding the front-end uses>'
>
>to enable automatic translation to unicode by PostgreSQL.

Er... This may sound like a dumb question, but the description of this list
*did* say no question was too basic here... How do I tell what encoding the
program I'm entering the data with (currently FileMaker Pro on a Mac) is
using?
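
One rough way to check, sketched here in Python: export a record containing
a known accented word (say "café") to a text file and look at the raw bytes.
The same character is stored as different bytes in each encoding, so the
bytes give the encoding away. The function name and the candidate list are
only illustrative assumptions, not anything FileMaker or PostgreSQL provides.

```python
# Candidate encodings to try. This list is an assumption; extend as needed.
CANDIDATES = ["utf-8", "mac_roman", "iso-8859-1"]

def guess_encodings(raw, expected):
    """Return the candidate encodings under which raw decodes to expected."""
    matches = []
    for name in CANDIDATES:
        try:
            if raw.decode(name) == expected:
                matches.append(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding, so skip it.
            pass
    return matches

# The word "café" as three different encodings would store it.
print(guess_encodings(b"caf\x8e", "café"))      # Mac Roman bytes
print(guess_encodings(b"caf\xe9", "café"))      # ISO-8859-1 bytes
print(guess_encodings(b"caf\xc3\xa9", "café"))  # UTF-8 bytes
```

Whichever name matches is the encoding the export used. Whether PostgreSQL
accepts that name in SET CLIENT_ENCODING is a separate question to check
against the multibyte documentation.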

Once the database is up on the web, further data entry will be via a web
form processed with PHP, so I presume in that case I can use PHP to control
the encoding.

>Read the manual for further information:
>http://www.postgresql.org/docs/view.php?version=7.3&file=multibyte.html

I actually did read that before posing that question, but was still pretty
confused, thus my post here. :-)

>>3. Speaking of HTML, even if PostgreSQL is set up to correctly deal with
>>accented characters, when the output is displayed on the web, are they going
>>to have to be converted into &...; form?
>
>Here too you have to tell the browser it's going to receive data in
>unicode. I don't know whether you can do this in HTML, or whether the
>user must choose unicode from the browser's appropriate menu.

Maybe using a Content-Type meta tag like the one Dreamweaver automatically
inserts in everything? The default one it uses is <meta
http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> - I
presume I'd just change iso-8859-1 to unicode?
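
For what it's worth, the charset label for Unicode in that tag is normally
written utf-8 rather than unicode. A minimal Python sketch of the idea,
assuming the page is generated by a script: the declared charset and the
encoding actually used for the bytes have to agree. The function name
render_page is hypothetical.

```python
import html

def render_page(body_text):
    """Build a tiny HTML page and return it as UTF-8 encoded bytes."""
    page = (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "</head><body><p>" + html.escape(body_text) + "</p></body></html>"
    )
    # The encoding used here must match the charset declared in the meta tag.
    return page.encode("utf-8")

page_bytes = render_page("Montréal")
print(page_bytes)
```

With the tag and the bytes both in UTF-8, the accented character goes out as
typed and no &...; conversion is needed for it.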

In my experience, relying on users to change their browser settings to
accommodate your site is usually a very bad idea. 3/4 of them don't know how
and the rest can't be bothered.

>Perhaps you can have PostgreSQL translate the encoding to iso-latin, the
>Windows standard.

Not sure if that would work - the default charset for most web pages seems
to be iso-8859-1, but that still requires accented characters to use ASCII
codes - it can't handle them being typed directly in your text.

I don't really mind if I have to do a global find-and-replace on the
exported text from the existing FileMaker Pro database to turn all the
accented characters into ASCII codes, but it would be a pain for everyone
entering data in the future to have to use those. Most of the people working
on this will not have said codes all memorized, the way I do from making web
sites for 6-7 years.
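
The global find-and-replace on the exported text could be sketched roughly
like this in Python, assuming the export has already been decoded into a
normal string. The function name to_entities is hypothetical; the entity
table comes from Python's standard html.entities module.

```python
from html.entities import codepoint2name

def to_entities(text):
    """Replace every non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII passes through unchanged (including &, < and >).
            out.append(ch)
        elif code in codepoint2name:
            # Use the named entity where HTML defines one, e.g. &eacute;
            out.append("&" + codepoint2name[code] + ";")
        else:
            # Otherwise fall back to a numeric character reference.
            out.append("&#" + str(code) + ";")
    return "".join(out)

print(to_entities("Café à Montréal"))
```

This handles the one-time conversion; it does nothing for people typing new
data, who would still need the form itself to accept accented characters.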

I should find out how LiveJournal.com handles encoding. I know there I can
type accented characters in directly in their forms and they seem to display
properly.

Lynna
--
Resource Centre Database Coordinator
Gallery 44
www.gallery44.org
