From: | Thomas Kellerer <spam_eater(at)gmx(dot)net> |
---|---|
To: | pgsql-jdbc(at)postgresql(dot)org |
Subject: | Re: getTables() doesn't handle umlauts correctly |
Date: | 2010-11-23 14:22:10 |
Message-ID: | icgimh$46u$1@dough.gmane.org |
Lists: | pgsql-jdbc |
Kris Jurka, 23.11.2010 09:13:
> As the discussion has shown, trying to determine who is at fault here
> is not trivial. The best way to show that postgresql (driver or
> server if you're seeing it in pgadmin too) is at fault is to create a
> test case creating the table and then querying the metadata. It would
> be helpful to use either a Java or PG escape code for the special
> character so it doesn't get mangled by either mail clients or build
> environments. Then use String.codePointAt to print out the actual
> data for both the table name used for construction and returned by
> the metadata. That would conclusively show that PG is at fault
> somewhere.
OK, this is my test program:
import java.sql.*;

public class UmlautTest {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection("jdbc:postgresql://localhost:5432/postgres", "postgres", "postgres");
        Statement stmt = con.createStatement();

        // Create a table whose name contains an umlaut and insert one row of umlauts
        stmt.executeUpdate("create table umlaut_ö (some_data varchar(10))");
        stmt.executeUpdate("insert into umlaut_ö (some_data) values ('öäü')");

        // Read the table name back through the metadata and print its code points
        ResultSet rs = con.getMetaData().getTables(null, "public", "umlaut%", null);
        if (rs.next()) {
            String name = rs.getString("TABLE_NAME");
            System.out.println("table name: " + name);
            System.out.print(" codepoints:");
            for (int i = 0; i < name.length();) {
                int cp = name.codePointAt(i);
                System.out.print(" " + cp);
                i += Character.charCount(cp);
            }
            System.out.println();
        }
        rs.close();

        // Check that the umlauts sent to the server match the stored data
        rs = stmt.executeQuery("select count(*) from umlaut_ö where some_data = 'öäü'");
        if (rs.next()) {
            int count = rs.getInt(1);
            System.out.println("number of rows: " + count);
        }
        rs.close();

        // Read the column value back and print its code points
        rs = stmt.executeQuery("select some_data from umlaut_ö");
        if (rs.next()) {
            String data = rs.getString(1);
            System.out.println("data: " + data);
            System.out.print(" codepoints:");
            for (int i = 0; i < data.length();) {
                int cp = data.codePointAt(i);
                System.out.print(" " + cp);
                i += Character.charCount(cp);
            }
            System.out.println();
        }
        rs.close();

        stmt.executeUpdate("drop table umlaut_ö");
        stmt.close();
        con.close();
    }
}
The output on my computer is:
table name: umlaut_test_�
codepoints: 117 109 108 97 117 116 95 116 101 115 116 95 65533
number of rows: 1
data: öäü
codepoints: 246 228 252
So it seems that the umlauts in the table name are returned with a different encoding than the data itself.
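For reference, code point 65533 is U+FFFD, the Unicode replacement character, which Java substitutes when it decodes bytes that are malformed in the target charset. The following is only a sketch of one way such a value can arise. It assumes, without this being established by the test above, that the name travels as LATIN1 bytes and is decoded as UTF-8. The class `ReplacementCharDemo` and its methods are made up for illustration, and `StandardCharsets` requires Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Hypothetical helper: encode the string as ISO-8859-1 (LATIN1),
    // then decode those same bytes as if they were UTF-8.
    static String decodeLatin1AsUtf8(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Returns the code points of s, each preceded by a space.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length();) {
            int cp = s.codePointAt(i);
            sb.append(' ').append(cp);
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00f6 is the Java escape for "ö", so the source encoding cannot alter it
        String original = "umlaut_\u00f6";
        String mangled = decodeLatin1AsUtf8(original);
        System.out.println("original:" + codePoints(original));
        System.out.println("mangled: " + codePoints(mangled));
    }
}
```

In this sketch the single LATIN1 byte for "ö" (0xF6) is not valid UTF-8, so the decoded string ends in code point 65533 while the ASCII part of the name is unchanged. That matches the pattern in the output above, but it does not prove that this is what happens between the server and the driver.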
Nevertheless, umlauts *sent* to the server are always handled correctly, both in table names and in column values.
This is with 9.0.1 on Windows XP using postgresql-9.0-801.jdbc4.jar
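As Kris suggested, the literal umlauts in the test program could also be written as Java `\uXXXX` escapes, so that neither a mail client nor the compiler's source encoding can change them. Below is a minimal sketch of that idea. The class `EscapedNames`, its constant and its methods are hypothetical names chosen for illustration, and `String.codePoints()` requires Java 8 or later.

```java
import java.util.Arrays;

public class EscapedNames {
    // "umlaut_ö" written with a Java escape instead of a literal umlaut
    static final String TABLE = "umlaut_\u00f6";

    // Builds the CREATE TABLE statement from the escaped name
    static String createSql() {
        return "create table " + TABLE + " (some_data varchar(10))";
    }

    // Compares the name returned by the metadata with the name that was
    // used to create the table, code point by code point.
    static boolean matches(String returnedName) {
        return Arrays.equals(TABLE.codePoints().toArray(),
                             returnedName.codePoints().toArray());
    }

    public static void main(String[] args) {
        System.out.println(createSql());
        // A name ending in U+FFFD, as in the output above, does not match
        System.out.println("matches: " + matches("umlaut_\uFFFD"));
    }
}
```

In the test program, `createSql()` would take the place of the literal `create table` string, and `matches(rs.getString("TABLE_NAME"))` would report whether the metadata returned the same code points that were sent.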
Regards
Thomas