Trying to understand encoding.

From: Tomás Di Doménico <tdidomenico(at)avature(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Trying to understand encoding.
Date: 2008-02-15 13:46:38
Message-ID: 47B597BE.9070109@avature.net
Lists: pgsql-general

Greetings.

I'm currently using PostgreSQL 8.3, but I've been dealing with this since
earlier versions.

I'm trying to integrate some LATIN1 and some UTF8 DBs into a single UTF8
one. To avoid the "Invalid UNICODE character..." error, I used iconv to
convert the LATIN1 dumps to UTF8.
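
As a rough illustration of the conversion step described above, here is a minimal Python sketch of what re-encoding LATIN1 bytes as UTF8 does to the bytes. The function names and the sample string are assumptions made for illustration and are not part of the original dumps or of iconv itself. It also shows one thing worth checking: applying the conversion to input that is already UTF8 re-encodes each byte a second time.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode LATIN1 bytes as UTF-8 (hypothetical helper, similar in
    effect to converting a dump from ISO-8859-1 to UTF-8 with iconv)."""
    return raw.decode("latin-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Sample value only; any text with an accented character behaves the same way.
latin1_line = "Neuquén".encode("latin-1")   # one byte (0xE9) for the accented letter
utf8_line = latin1_to_utf8(latin1_line)     # two bytes (0xC3 0xA9) for the same letter

# The raw LATIN1 bytes are not valid UTF-8, which is why loading them
# unconverted into a UTF8 database is rejected.
latin1_ok = is_valid_utf8(latin1_line)
utf8_ok = is_valid_utf8(utf8_line)

# Converting a second time still yields valid UTF-8, but the accented
# letter is now stored as two separate characters.
converted_twice = latin1_to_utf8(utf8_line)
```

If some of the source databases were already UTF8, it may be worth confirming that only the LATIN1 dumps went through the conversion.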

Now I have the data in the UTF8 DB, and using graphical clients
everything seems to be great. The thing is, when I query the data via
psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
However, with \encoding LATIN1, everything looks fine.

So, I have a UTF8 DB, (what I think is) UTF8 data, and I can only see it
right by setting \encoding to LATIN1 in psql, or using a graphical client.
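
To make the symptom concrete, the following Python sketch shows how UTF8 bytes look when something along the way reads them as LATIN1, plus a simple heuristic for spotting text that may have been encoded twice. Both helper names are hypothetical, and the sketch does not establish which of the two situations (a display that is not really UTF8, or data converted twice) applies here.

```python
def as_seen_through_latin1(text: str) -> str:
    """Show what the UTF-8 bytes of `text` look like if each byte is
    interpreted as a LATIN1 character (hypothetical helper)."""
    return text.encode("utf-8").decode("latin-1")


def looks_double_encoded(text: str) -> bool:
    """Heuristic only: True if mapping the characters back to LATIN1 bytes
    gives valid UTF-8 that differs from the input."""
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in LATIN1, or not valid UTF-8 afterwards.
        return False
    return repaired != text


garbled = as_seen_through_latin1("Neuquén")

# A correctly stored value is not flagged; the garbled form is.
clean_flag = looks_double_encoded("Neuquén")
garbled_flag = looks_double_encoded(garbled)
```

A value fetched from the database could be passed to looks_double_encoded to see whether the stored text itself contains the two-character sequence, as opposed to the display merely rendering it that way.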

If anyone could help me try and understand this mess, I'd really
appreciate it.

Ah, these are my locale settings, in case it helps.

LANG=en_US.UTF-8
LC_CTYPE=C
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
