From: | Jan Ploski <jpljpl(at)gmx(dot)de> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Unicode database + JDBC driver performance |
Date: | 2003-01-01 19:07:17 |
Message-ID: | 3939074.1041448037948.JavaMail.jpl@remotejava |
Lists: | pgsql-general |
Hello,
Here is my UTF8Encoder class mentioned on pgsql-general.
It should be put into org/postgresql/core, and you will also
need to patch Encoding.java, so that it uses this class:
    if (encoding.equals("UTF-8")) {
        return UTF8Encoder.encode2(s);
    }
    else {
        return s.getBytes(encoding);
    }
There are two public utility methods in UTF8Encoder, encode1 and encode2.
They take two different approaches to determining how big the output
buffer should be. Performance-wise they seem very similar (encode2
being a bit slower), but I favor encode2 because it does less memory
allocation and copying.
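For readers without the attachment, here is a rough sketch of two common
ways to size a UTF-8 output buffer. It is not the code from
UTF8Encoder.tar.gz: the class name Utf8SizingSketch and the methods
encodeWorstCase, encodeExact, countBytes and write are hypothetical, and
mapping them onto encode1 and encode2 would only be a guess. The first
strategy over-allocates and copies once at the end; the second counts the
exact length in an extra pass and allocates only once.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not the attached UTF8Encoder.
public class Utf8SizingSketch {

    // Strategy A: allocate the worst case (3 bytes per UTF-16 char),
    // encode in one pass, then copy into a right-sized array.
    static byte[] encodeWorstCase(String s) {
        byte[] buf = new byte[s.length() * 3];
        int n = write(s, buf);
        return Arrays.copyOf(buf, n);
    }

    // Strategy B: count the exact output length first, then encode
    // directly into an array of that size (no trailing copy).
    static byte[] encodeExact(String s) {
        byte[] out = new byte[countBytes(s)];
        write(s, out);
        return out;
    }

    // Unpaired surrogates become '?', as String.getBytes does for UTF-8.
    static int normalize(int cp) {
        return (cp >= 0xD800 && cp <= 0xDFFF) ? '?' : cp;
    }

    static int countBytes(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int raw = s.codePointAt(i);
            i += Character.charCount(raw);
            int cp = normalize(raw);
            n += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        }
        return n;
    }

    // Writes the UTF-8 form of s into out and returns the byte count.
    static int write(String s, byte[] out) {
        int p = 0;
        for (int i = 0; i < s.length(); ) {
            int raw = s.codePointAt(i);
            i += Character.charCount(raw);
            int cp = normalize(raw);
            if (cp < 0x80) {
                out[p++] = (byte) cp;
            } else if (cp < 0x800) {
                out[p++] = (byte) (0xC0 | (cp >> 6));
                out[p++] = (byte) (0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out[p++] = (byte) (0xE0 | (cp >> 12));
                out[p++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (cp & 0x3F));
            } else {
                out[p++] = (byte) (0xF0 | (cp >> 18));
                out[p++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                out[p++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (cp & 0x3F));
            }
        }
        return p;
    }

    public static void main(String[] args) {
        String s = "na\u00efve \u20ac";
        byte[] expected = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(encodeWorstCase(s), expected));
        System.out.println(Arrays.equals(encodeExact(s), expected));
    }
}
```

Both methods use only local buffers, so neither needs synchronization.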
Note that I did not use any shared buffer, in order to avoid
synchronizing multiple threads (as I understand it, the class
Encoding must ensure thread safety itself). This may be an unnecessary
concern after all... I don't know.
UTF8Encoder can be used as is, or made into a private static inner
class of Encoding.java, whichever you prefer.
UTF8Encoder.main contains some tests to assert that it stays compatible
with Java's built-in encoder. It may be nicer to move them into
a JUnit test case; you decide.
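As an illustration of that kind of compatibility check (not the tests
shipped in UTF8Encoder.main), the sketch below compares a candidate
encoder with Java's built-in one for every non-surrogate character in
the Basic Multilingual Plane. The class Utf8CompatCheck and the method
firstMismatch are hypothetical names. The candidate used in main is the
JDK encoder itself as a stand-in; passing UTF8Encoder::encode2 instead
assumes that method takes a String and returns a byte[], which the patch
above suggests but which I have not checked here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical harness; not part of the attached archive.
public class Utf8CompatCheck {

    // Returns the first BMP character whose encoding differs from the
    // JDK's UTF-8 output, or -1 if every character matches.
    static int firstMismatch(Function<String, byte[]> encoder) {
        for (int c = 0; c <= 0xFFFF; c++) {
            if (Character.isSurrogate((char) c)) {
                continue; // a lone surrogate is not valid text
            }
            String s = String.valueOf((char) c);
            byte[] expected = s.getBytes(StandardCharsets.UTF_8);
            if (!Arrays.equals(encoder.apply(s), expected)) {
                return c;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in candidate: the JDK encoder itself.
        Function<String, byte[]> candidate =
                s -> s.getBytes(StandardCharsets.UTF_8);
        System.out.println("first mismatch: " + firstMismatch(candidate));
    }
}
```

The same loop body would fit in a single JUnit test method that asserts
the result is -1.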
Take care -
JPL
Attachment | Content-Type | Size |
---|---|---|
UTF8Encoder.tar.gz | application/octet-stream | 3.0 KB |