Re: Unicode confusion

From: Ian Barwick <barwick(at)gmx(dot)net>
To: "Chris Palmer" <chris(dot)palmer(at)geneed(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Unicode confusion
Date: 2003-05-13 06:13:28
Message-ID: 200305130813.28878.barwick@gmx.net
Lists: pgsql-general

On Tuesday 13 May 2003 00:35, Chris Palmer wrote:
(...)
> ===
> ps = new PrintStream(System.out, true, "UTF-8");
> ...
> // this line might look strange to you if your mailer shows it differently
> // than mine does:
> s.executeUpdate("INSERT INTO test (chug) VALUES ('¤ä´©¬O¬°¤FÅý')");
> s.executeUpdate("INSERT INTO test (chug) VALUES ('testing')");
> s.executeUpdate("INSERT INTO test (chug) VALUES ('\u262f\u0b87')");
> ...
> ps.println(rs.getString("chug"));
> ===
>
> I'm no Java expert, so if that's not a good way to get UTF-8-encoded
> output, please let me know. When I try it, I get:
>
> ===
>
> > java Noodle > goo
> > cat goo
>
> ¤ä´©¬O¬°¤FÃ
> ý
> testing
> â¯à®
> ===
>
> I installed KDE on our Linux machine (the one running Java and Pg) and got
> similar results using konsole. (Fwiw I am using PuTTY on Windows to
> connect to Linux).
>
> ===
> ¤ä´©¬O¬°¤FÃý
> testing
> â¯à®
> ===
>
> Note the lack of the newline in the middle of the first result.
>
> In either case, konsole or PuTTY, I am not getting back what I put in (the
> first s.executeUpdate(...), above).

Err, yes you are, just encoded differently (UTF-8 rather than whatever Java
uses internally, I would guess UCS-2 or UTF-16). The UTF-8 bytes are being
written to the display unchanged; the display just does not know that they
are UTF-8 and shows each byte as a separate character. Before starting
konsole you may need to set your locale to a UTF-8 one. (No idea whether
PuTTY is Unicode capable).
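
To make that concrete, here is a minimal sketch (not from the original
program; the class name Utf8Bytes and its helper methods are made up for
illustration, and it assumes a JDK recent enough to have
java.nio.charset.StandardCharsets). It encodes the test string
"\u262f\u0b87" as UTF-8 and then decodes the same bytes as ISO-8859-1, which
is roughly what a non-UTF-8 terminal does:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    // Hex dump of the UTF-8 encoding of a string, bytes separated by spaces.
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Reinterpret the UTF-8 bytes as ISO-8859-1, one character per byte,
    // as a display that is unaware of UTF-8 would.
    public static String asLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = "\u262f\u0b87";
        System.out.println(utf8Hex(s));           // prints "e2 98 af e0 ae 87"
        System.out.println(asLatin1(s).length()); // prints 6
    }
}
```

Two characters become six bytes. Read as Latin-1, bytes e2, af, e0 and ae
are the four visible characters in the "â¯à®" output quoted above, while 98
and 87 are control characters that a terminal normally does not draw.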

> In psql, the result of "select * from test" looks the same as it does when
> output by the Noodle Java program.
>
> Fwiw, I do have the encoding of this database set to UNICODE:

This is expected behaviour. Have you looked to see what encoding
Postgres uses to store Unicode?

Anyway, the obvious question is: have you tried printing the strings
you are currently passing through Postgres directly?
( ps.println("\u262f\u0b87"); ?) Do they appear any differently?
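
A rough sketch of that check, bypassing the database entirely (the class
name DirectPrint and the helper printed() are hypothetical, not part of the
Noodle program). It writes the string through the same kind of UTF-8
PrintStream as in the quoted code, and also captures the output in memory so
the byte count can be inspected regardless of how the terminal renders it:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class DirectPrint {
    // Print s through a UTF-8 PrintStream into a buffer and return the raw bytes.
    public static byte[] printed(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            PrintStream ps = new PrintStream(buf, true, "UTF-8");
            ps.print(s);
            ps.flush();
            return buf.toByteArray();
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Same construction as in the quoted program.
        PrintStream ps = new PrintStream(System.out, true, "UTF-8");
        ps.println("\u262f\u0b87");

        // Number of bytes actually emitted for the two characters.
        System.out.println(printed("\u262f\u0b87").length); // prints 6
    }
}
```

If the first line of output looks the same as what comes back from the
database, the data is surviving the round trip and only the display is at
fault.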

Ian Barwick
barwick(at)gmx(dot)net
