Re: JDBC to load UTF8@psql to latin1@mysql

From: Emi Lu <emilu(at)encs(dot)concordia(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org, larry(dot)meadors(at)gmail(dot)com
Subject: Re: JDBC to load UTF8@psql to latin1@mysql
Date: 2012-12-14 18:37:19
Message-ID: 50CB71DF.5060004@encs.concordia.ca
Lists: pgsql-general

Hello All,
>> Meh. That character renders as \310 in your mail, which is not an
>> assigned code in ISO 8859-1. The numerically corresponding Unicode
>> value would be U+0090, which is an unspecified control character.
>
> Oh, scratch that, apparently I can't do hex/octal arithmetic in my
> head first thing in the morning. It's really U+00C8 which is perfectly
> valid. I can't see a reason why that character and only that character
> would be problematic --- have you done systematic testing to confirm
> that that's the only should-be-LATIN1 character that fails?
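
The corrected arithmetic in the quote above can be checked mechanically. This is only a small sketch: the class and method names are made up for illustration and are not part of the thread. It parses the octal escape \310 and decodes the resulting byte as ISO 8859-1.

```java
import java.nio.charset.StandardCharsets;

public class OctalEscapeCheck {
    // Parse the digits of an octal escape such as \310 into a number.
    static int fromOctal(String digits) {
        return Integer.parseInt(digits, 8);
    }

    // Decode a single byte value as ISO 8859-1 (LATIN1).
    static String decodeLatin1(int byteValue) {
        return new String(new byte[] {(byte) byteValue}, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        int value = fromOctal("310");
        // Show the same value in hex and as a Unicode code point.
        System.out.printf("octal 310 = 0x%02X = U+%04X%n", value, value);
        // ISO 8859-1 maps each byte to the code point with the same number.
        String decoded = decodeLatin1(value);
        System.out.printf("decoded code point = U+%04X%n", decoded.codePointAt(0));
    }
}
```

Octal 310 is hex C8, and ISO 8859-1 maps that byte to U+00C8, which matches the correction.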

Finally, the problem is resolved:

SHOW VARIABLES LIKE "character\_set\_%";
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
-- here mysql uses utf8 for character_set_system.

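I changed my Java code to:
========================
// Round-trips the string through UTF-8 bytes, naming the charset explicitly.
public static String utf8_to_mysql(String str)
        throws Exception
{
    try
    {
        // Encode with an explicit charset rather than the platform default.
        byte[] convertStringToByte = str.getBytes("UTF-8");
        // Decode the same bytes back, again naming UTF-8 explicitly.
        str = new String(convertStringToByte, "UTF-8");
        return str;
    }
    catch (Exception e)
    {
        log.error("utf8_to_mysql Error: " + e.getMessage());
        log.error(e);
        throw e;
    }
}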

"UTF-8" has to be specified explicitly; the charset argument cannot be left
empty.

Following Larry's comments (from the MyBatis mailing list), I tried "UTF-8"
for both the "from" and the "to" conversion. It works. This is still a little
bit strange to me, but it works!
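
One possible way to see what naming the charset changes: String.getBytes() with no argument uses the platform default charset, so its result can differ from machine to machine, while an explicit UTF-8 encode and decode always returns the original string. The sketch below is illustrative only. The class and method names are invented here, and it uses StandardCharsets instead of the "UTF-8" string so that no checked exception is involved. It also shows what the same bytes look like when read back as LATIN1.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Encode to UTF-8 bytes and decode them as UTF-8 again.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode to UTF-8 bytes but decode them as ISO 8859-1 (a charset mismatch).
    static String utf8ReadAsLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00C8"; // the character discussed in this thread

        // A matched encode/decode pair leaves the string unchanged.
        System.out.println("round trip equal: " + roundTrip(original).equals(original));

        // U+00C8 takes two bytes in UTF-8, so a LATIN1 reader sees two characters.
        String garbled = utf8ReadAsLatin1(original);
        System.out.println("mismatched length: " + garbled.length());
    }
}
```

Since the matched round trip is an identity operation, the conversion to latin1 presumably happens later, between the JDBC driver and the MySQL server, but that is a guess and not something confirmed in this thread.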

>> My guess is that it's correct but the client you're using is messing
>> it up. If not, then you need to look at your connection strings to
>> the 2 databases to make sure they are handling the encodings
>> correctly. Unless you set them specifically, I suspect they are using
>> your default system encoding - so both may be using utf8 or iso8859.
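
As a rough illustration of the connection-string advice quoted above, MySQL Connector/J accepts the useUnicode and characterEncoding properties in its JDBC URL. The host name, database name, and helper below are placeholders invented for this sketch, not values from this thread, and whether these properties are needed depends on the driver version in use.

```java
public class MysqlUrlSketch {
    // Build a Connector/J URL that names the client-side encoding explicitly.
    // host and database are hypothetical placeholders.
    static String mysqlUrl(String host, String database, String javaCharset) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=" + javaCharset;
    }

    public static void main(String[] args) {
        // Print the URL only; no connection is opened in this sketch.
        System.out.println(mysqlUrl("localhost", "testdb", "UTF-8"));
    }
}
```

The resulting string would be passed to DriverManager.getConnection together with the user name and password.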

Thank you very much for all of your help with this!
Emi
