From: | Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com> |
---|---|
To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | UTF-8 encoding problem w/ libpq |
Date: | 2013-06-03 14:40:14 |
Message-ID: | 11A8567A97B15648846060F5CD818EB8CAC2253F5E@DEV001EX.Dev.cadcorp.net |
Lists: | pgsql-hackers |
I am trying to create database columns whose names contain umlauts, using the UTF8 client encoding. However, the server seems to mangle the column names: it appears to apply a lowercase operation to each individual byte of the UTF-8 multi-byte sequence.
Here is my code:
const wchar_t *strName = L"id_äß";
wstring strCreate = wstring(L"create table test_umlaut(") + strName + L" integer primary key)";
PGconn *pConn = PQsetdbLogin("", "", NULL, NULL, "dev503", "postgres", "******");
if (!pConn) FAIL;
if (PQsetClientEncoding(pConn, "UTF-8")) FAIL;
PGresult *pResult = PQexec(pConn, "drop table test_umlaut");
if (pResult) PQclear(pResult);
pResult = PQexec(pConn, ToUtf8(strCreate.c_str()).c_str());
if (pResult) PQclear(pResult);
pResult = PQexec(pConn, "select * from test_umlaut");
if (!pResult) FAIL;
if (PQresultStatus(pResult)!=PGRES_TUPLES_OK) FAIL;
if (PQnfields(pResult)!=1) FAIL;
const char *fName = PQfname(pResult,0);
ShowW("Name: ", strName);
ShowA("in UTF8: ", ToUtf8(strName).c_str());
ShowA("from DB: ", fName);
ShowW("in UTF16: ", ToWide(fName).c_str());
PQclear(pResult);
PQreset(pConn);
(ShowA/W call OutputDebugStringA/W, and ToUtf8/ToWide use WideCharToMultiByte/MultiByteToWideChar with CP_UTF8.)
And this is the output generated:
Name: id_äß
in UTF8: id_äß
from DB: id_ã¤ãÿ
in UTF16: id_???
It seems like the backend thinks the name is in ANSI encoding, not in UTF-8.
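The output above is consistent with a byte-wise lowercase under a Windows-1252 locale. The following is a minimal sketch of that hypothesis, not the backend's actual code: `LowerCp1252` and `LowerBytewise` are hypothetical helpers that hard-code the Windows-1252 case mapping so the result does not depend on the process locale.

```cpp
#include <string>

// Hypothetical helper: lowercase a single byte the way a Windows-1252
// single-byte locale would (assumption: this mimics the suspected behaviour).
unsigned char LowerCp1252(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return static_cast<unsigned char>(c + 0x20);
    // 0xC0..0xDE are uppercase letters, except 0xD7 (multiplication sign).
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return static_cast<unsigned char>(c + 0x20);
    // 0x8A, 0x8C, 0x8E map to 0x9A, 0x9C, 0x9E.
    if (c == 0x8A || c == 0x8C || c == 0x8E)
        return static_cast<unsigned char>(c + 0x10);
    // 0x9F (Y with diaeresis, uppercase) maps to 0xFF (lowercase).
    if (c == 0x9F)
        return 0xFF;
    return c;
}

// Hypothetical helper: apply the single-byte mapping to every byte,
// ignoring the fact that the input is really UTF-8.
std::string LowerBytewise(const std::string &utf8)
{
    std::string out(utf8);
    for (char &ch : out)
        ch = static_cast<char>(LowerCp1252(static_cast<unsigned char>(ch)));
    return out;
}
```

Under this assumption, the UTF-8 bytes of `id_äß` (`69 64 5F C3 A4 C3 9F`) become `69 64 5F E3 A4 E3 FF`. That is no longer valid UTF-8, and when viewed as Windows-1252 it reads as the "from DB" string shown above.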
If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work without quoting the column name.
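The quoting workaround can be sketched as below. `QuoteIdent` is a hypothetical helper, not part of libpq; it wraps the name in double quotes and doubles any embedded double quote, which is the standard SQL rule for delimited identifiers. libpq also provides `PQescapeIdentifier`, which does the equivalent for an already-encoded `char` string on a live connection.

```cpp
#include <string>

// Hypothetical helper: turn a raw identifier into a double-quoted SQL
// identifier, doubling any embedded double quotes.
std::wstring QuoteIdent(const std::wstring &name)
{
    std::wstring out = L"\"";
    for (wchar_t ch : name) {
        if (ch == L'"')
            out += L'"';   // escape an embedded quote by doubling it
        out += ch;
    }
    out += L'"';
    return out;
}
```

With this helper the query would be built as `wstring(L"create table test_umlaut(") + QuoteIdent(strName) + L" integer primary key)"`. A quoted identifier is not case-folded, so the name's bytes should be stored unchanged.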
Am I missing some setup in either the database or in the use of libpq?
I’m using PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit.
The database uses:
ENCODING = 'UTF8'
LC_COLLATE = 'English_United Kingdom.1252'
LC_CTYPE = 'English_United Kingdom.1252'
Thanks for any help,
Martin