UTF-8 encoding problem w/ libpq

From:	Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	UTF-8 encoding problem w/ libpq
Date:	2013-06-03 14:40:14
Message-ID:	11A8567A97B15648846060F5CD818EB8CAC2253F5E@DEV001EX.Dev.cadcorp.net
Lists:	pgsql-hackers

I am trying to create database columns whose names contain umlauts, using the UTF8 client encoding. However, the server seems to mangle the column names. In particular, it appears to apply a lowercase operation to each individual byte of the UTF-8 multi-byte sequences.

Here is my code:

const wchar_t *strName = L"id_äß";
wstring strCreate = wstring(L"create table test_umlaut(") + strName + L" integer primary key)";

PGconn *pConn = PQsetdbLogin("", "", NULL, NULL, "dev503", "postgres", "******");
if (!pConn) FAIL;
if (PQsetClientEncoding(pConn, "UTF-8")) FAIL;

PGresult *pResult = PQexec(pConn, "drop table test_umlaut");
if (pResult) PQclear(pResult);

pResult = PQexec(pConn, ToUtf8(strCreate.c_str()).c_str());
if (pResult) PQclear(pResult);

pResult = PQexec(pConn, "select * from test_umlaut");
if (!pResult) FAIL;
if (PQresultStatus(pResult)!=PGRES_TUPLES_OK) FAIL;
if (PQnfields(pResult)!=1) FAIL;
const char *fName = PQfname(pResult,0);

ShowW("Name: ", strName);
ShowA("in UTF8: ", ToUtf8(strName).c_str());
ShowA("from DB: ", fName);
ShowW("in UTF16: ", ToWide(fName).c_str());

PQclear(pResult);
PQreset(pConn);

(ShowA/W call OutputDebugStringA/W, and ToUtf8/ToWide use WideCharToMultiByte/MultiByteToWideChar with CP_UTF8.)

And this is the output generated:

Name: id_äß
in UTF8: id_Ã¤ÃŸ
from DB: id_ã¤ãÿ
in UTF16: id_???
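
The byte pattern in this output is consistent with a single-byte lowercase pass over the UTF-8 bytes. The following is a minimal sketch of that hypothesis, not PostgreSQL's actual code: LowerCp1252Byte and LowerBytewise are hypothetical helpers that lowercase each byte as if it were a Windows-1252 character. The lead byte 0xC3 (shown as Ã) becomes 0xE3 (ã), and the trail byte 0x9F of ß (shown as Ÿ) becomes 0xFF (ÿ), which is no longer valid UTF-8.

```cpp
#include <string>

// Hypothetical helper for illustration only (not taken from PostgreSQL):
// lowercase one byte as if it were a single Windows-1252 character.
unsigned char LowerCp1252Byte(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return static_cast<unsigned char>(c + 0x20);   // ASCII A-Z
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return static_cast<unsigned char>(c + 0x20);   // 0xC0-0xDE, skipping the multiplication sign
    if (c == 0x8A || c == 0x8C || c == 0x8E)
        return static_cast<unsigned char>(c + 0x10);   // S-caron, OE ligature, Z-caron
    if (c == 0x9F)
        return 0xFF;                                   // Y-diaeresis
    return c;                                          // everything else is unchanged
}

// Apply the single-byte lowercase to every byte of a string,
// ignoring the fact that the string may be UTF-8 encoded.
std::string LowerBytewise(const std::string &s)
{
    std::string out = s;
    for (char &ch : out)
        ch = static_cast<char>(LowerCp1252Byte(static_cast<unsigned char>(ch)));
    return out;
}
```

Feeding the UTF-8 bytes of the column name (69 64 5F C3 A4 C3 9F) through LowerBytewise gives 69 64 5F E3 A4 E3 FF, which displays as "id_ã¤ãÿ" when viewed as Windows-1252, the same as the "from DB" line.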

It seems that the backend treats the name as ANSI-encoded rather than UTF-8.
If I put double quotes around the column name in the strCreate query, the problem disappears. But the original name is already lowercase, so I would expect it to work without quoting as well.
Am I missing some setup, either in the database or in how I use libpq?
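
For reference, the quoting workaround mentioned above can be sketched as follows. QuoteIdentifier is a hypothetical helper, not part of libpq: it wraps a name in double quotes and doubles any embedded double quote, so the server takes the identifier as written instead of case-folding it. With a live connection, libpq's own PQescapeIdentifier would probably be the better choice, since it also takes the connection's encoding into account.

```cpp
#include <string>

// Hypothetical helper for illustration: produce a double-quoted SQL
// identifier. Quoted identifiers are not case-folded by the server.
std::string QuoteIdentifier(const std::string &name)
{
    std::string out = "\"";
    for (char ch : name) {
        if (ch == '"')
            out += '"';      // an embedded quote is escaped by doubling it
        out += ch;
    }
    out += '"';
    return out;
}
```

In the code above this would be used as "create table test_umlaut(" + QuoteIdentifier(ToUtf8(strName)) + " integer primary key)", assuming ToUtf8 returns a std::string.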

I'm using PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit.

The database uses:
ENCODING = 'UTF8'
LC_COLLATE = 'English_United Kingdom.1252'
LC_CTYPE = 'English_United Kingdom.1252'

Thanks for any help,

Martin

Responses

Re: UTF-8 encoding problem w/ libpq at 2013-06-03 14:47:59 from ktm@rice.edu
