From: | "Sergiy Vyshnevetskiy" <serg(at)vostok(dot)net> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #2685: Wrong charset of server messages on client [PATCH] |
Date: | 2006-10-10 14:55:29 |
Message-ID: | 200610101455.k9AEtTTd085210@wwwmaster.postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged online:
Bug reference: 2685
Logged by: Sergiy Vyshnevetskiy
Email address: serg(at)vostok(dot)net
PostgreSQL version: 8.1
Operating system: FreeBSD-6 stable
Description: Wrong charset of server messages on client [PATCH]
Details:
DESCRIPTION:
The PostgreSQL backend uses gettext() to localize its messages. By default,
the charset of localized messages is determined by LC_CTYPE.
The message is then processed through a sprintf-like mechanism (with database
data as possible arguments) and fed to send_message_to_frontend(), which
converts the data from the _database_charset_(!) to the client charset.
If the LC_CTYPE charset is not the same as (or at least binary compatible
with) the database charset, the client gets garbage characters in server
messages. If the database charset is UTF-8, the cluster may recursively
generate "invalid byte sequence for encoding" errors until it fills up
errordata[ERRORDATA_STACK_SIZE], and then it panics.
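To illustrate why a mismatched message charset trips the UTF-8 check, here is a minimal sketch. It is not PostgreSQL code: is_valid_utf8() is a hypothetical, simplified validator, and the two byte arrays are the Russian word for "error" as I would expect it to be encoded in KOI8-R and in UTF-8. The assumption is a server whose LC_CTYPE charset is KOI8-R while the database charset is UTF-8.

```c
#include <stddef.h>

/*
 * Hypothetical helper, NOT the PostgreSQL implementation.
 * Returns 1 if buf[0..len) is structurally valid UTF-8, 0 otherwise.
 * Simplified: it checks lead bytes and continuation bytes only and
 * does not reject every overlong or surrogate encoding.
 */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;                    /* continuation bytes expected */

        if (c < 0x80)
            need = 0;                   /* plain ASCII */
        else if (c >= 0xC2 && c <= 0xDF)
            need = 1;                   /* 2-byte sequence */
        else if (c >= 0xE0 && c <= 0xEF)
            need = 2;                   /* 3-byte sequence */
        else if (c >= 0xF0 && c <= 0xF4)
            need = 3;                   /* 4-byte sequence */
        else
            return 0;                   /* stray continuation or bad lead */

        if (need > len - i - 1)
            return 0;                   /* sequence truncated */

        for (size_t k = 1; k <= need; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */

        i += need + 1;
    }
    return 1;
}

/* The same six-letter Russian word in two encodings (illustrative data). */
const unsigned char msg_koi8r[] = { 0xCF, 0xDB, 0xC9, 0xC2, 0xCB, 0xC1 };
const unsigned char msg_utf8[]  = { 0xD0, 0xBE, 0xD1, 0x88, 0xD0, 0xB8,
                                    0xD0, 0xB1, 0xD0, 0xBA, 0xD0, 0xB0 };
```

Calling is_valid_utf8(msg_koi8r, sizeof msg_koi8r) fails on the second byte, because 0xCF announces a two-byte sequence and 0xDB is not a continuation byte. If the text of the resulting error is itself produced in KOI8-R, it is rejected in the same way, which matches the recursion described above.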
SOLUTION:
Convert server messages to database charset.
PATCH:
--- src/backend/utils/mb/mbutils.c.o0 Tue Oct 10 11:51:13 2006
+++ src/backend/utils/mb/mbutils.c Tue Oct 10 11:49:22 2006
@@ -615,6 +615,7 @@
DatabaseEncoding = &pg_enc2name_tbl[encoding];
Assert(DatabaseEncoding->encoding == encoding);
#ifdef USE_ICU
+ bind_textdomain_codeset("postgres",(&pg_enc2iananame_tbl[encoding])->name);
ucnv_setDefaultName((&pg_enc2iananame_tbl[encoding])->name);
#endif
}
This, however, uncovers another bug: PostgreSQL dumps messages to
stderr/syslog as-is, without converting database data from the database
charset to the charset from LC_MESSAGES. After this patch the same will
happen to the message text too. The fix should be trivial: set up a
conversion from the database charset to the server charset. I will post a
patch for it later.
NOTE:
I used pg_enc2iananame_tbl instead of pg_enc2name_tbl, because gettext
doesn't accept many
Possible TODO:
Change PostgreSQL charset names to IANA-standard names.
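
The naming mismatch behind the note and TODO above can be sketched as a small lookup table. This is only an illustration: the struct, the table and iana_charset_name() are hypothetical names, and the entries are a handful of pairs I am reasonably sure of, not the actual contents of pg_enc2iananame_tbl, which this report does not show.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: PostgreSQL-style name -> IANA-style name. */
struct charset_alias {
    const char *pg_name;
    const char *iana_name;
};

/* Illustrative entries only; assumed, not copied from PostgreSQL. */
static const struct charset_alias charset_aliases[] = {
    { "UTF8",    "UTF-8"        },
    { "LATIN1",  "ISO-8859-1"   },
    { "KOI8",    "KOI8-R"       },
    { "WIN1251", "windows-1251" },
    { "EUC_JP",  "EUC-JP"       },
};

/*
 * Returns the IANA-style name for a PostgreSQL-style charset name,
 * or NULL if the name is not in the illustrative table.
 */
const char *iana_charset_name(const char *pg_name)
{
    size_t n = sizeof charset_aliases / sizeof charset_aliases[0];

    for (size_t i = 0; i < n; i++)
        if (strcmp(charset_aliases[i].pg_name, pg_name) == 0)
            return charset_aliases[i].iana_name;
    return NULL;
}
```

With such a table, a name like "UTF8" would be translated to "UTF-8" before being handed to bind_textdomain_codeset(). If PostgreSQL used IANA names natively, as the TODO suggests, the translation step would not be needed.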