From: | chap at anastigmatix(dot)net (Chapman Flack) |
---|---|
Subject: | [Pljava-dev] PL/java kills unicode chars? |
Date: | 2015-09-20 03:41:04 |
Message-ID: | 55FE2AD0.1040305@anastigmatix.net |
Lists: | pljava-dev |
Srivatsan Ramanujam wrote:
> I believe PL/java is killing unicode characters (it is probably converting
> text to a byte stream and reading them as single byte characters - perhaps
> Latin-1 and not as UTF-8). ...
Well, this one has been open for a while. It's also in the GitHub tracker,
https://github.com/tada/pljava/issues/21, which I've just updated with
confirming test code.
I think vatsan's comment (on the GitHub issue) about
http://bugs.sun.com/view_bug.do?bug_id=5030776 is probably spot on.
We seem to handle the whole Basic Multilingual Plane correctly
(nearly 64k code points); it's only the planes above it that are getting
messed up.
Should be a straightforward fix, once I've found how many places those
not-really-UTF-8 JNI functions (they use Java's "modified UTF-8") are
actually used in the code.
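
To illustrate why only the planes above the BMP would be affected, here is a
minimal sketch in plain Java. It is not PL/Java code; the class and method
names are made up for the example. It uses DataOutputStream.writeUTF, which
writes the same "modified UTF-8" form that the JNI string functions use, and
compares it with standard UTF-8 from String.getBytes. The two encodings
should agree for a BMP character and differ for a supplementary one, because
modified UTF-8 encodes each half of the surrogate pair separately.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of PL/Java.
final class ModifiedUtf8Demo {

    // Standard UTF-8, which is what a UTF8-encoded PostgreSQL database expects.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java's modified UTF-8, obtained via DataOutputStream.writeUTF.
    // writeUTF prepends a 2-byte length, which is stripped here.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        String bmp = "\u20AC";                                 // U+20AC, inside the BMP
        String supp = new String(Character.toChars(0x1F600)); // U+1F600, above the BMP

        System.out.println("BMP:  standard=" + standardUtf8(bmp).length
                + " bytes, modified=" + modifiedUtf8(bmp).length + " bytes");
        System.out.println("Supp: standard=" + standardUtf8(supp).length
                + " bytes, modified=" + modifiedUtf8(supp).length + " bytes");
    }
}
```

For U+1F600 the standard encoding is one 4-byte sequence, while the modified
form is two 3-byte sequences (one per surrogate), which a strict UTF-8
consumer will not accept as that character.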
-Chap