Re: How to have ant's <sql> task insert special chars appropriately?

From: Richard Huxton <dev(at)archonet(dot)com>
To: agostonbejo <bejoag1(at)yahoo(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How to have ant's <sql> task insert special chars appropriately?
Date: 2009-09-24 17:46:47
Message-ID: 4ABBB087.7090600@archonet.com
Lists: pgsql-general

agostonbejo wrote:
>
>
> Hi Richard,
>
> thanks for the answer! Nevertheless, see below... ;)
>
>> Richard Huxton wrote:
>>> agostonbejo wrote:
>>> Hi!
>>>
>>> What I'm trying to do is to insert some data from a sql file into a postgres
>>> DB by calling the <sql> ant task. My problem is that I can't get special
>>> characters (even if they can be represented by the standard ASCII charset,
>>> such as ä, ö, ü, é, etc.) to be inserted correctly.
>> Those aren't ASCII.
>
> OK, probably my idea of what ASCII is is a bit too vague: by ASCII I simply
> meant the ISO-8859-1 charset. (Which might make further discussions about
> what exactly belongs to ASCII unnecessary...?)
>
> Eclipse (the editor which I'm using) says that the original SQL file's
> encoding is ISO-8859-1, the special characters are shown correctly, also in
> other text editors.

OK.

>> There are three places you need to get this right:
>> 1. The database encoding
>> 2. The client encoding
>> 3. The encoding of the contents of the .sql file
>>
>> Now, since the database is UTF8 that means it can accept the entire
>> range of unicode characters, including all ISO-8859-1.
>>
>> PostgreSQL can automatically convert from ISO-8859-1 to UTF-8 for you,
>> so it doesn't matter which you have in your .sql file.
>>
>> What *does* matter is that you know what encoding your .sql file is
>> using and that you set the client encoding appropriately.
>
> How do I set the client encoding to ISO-8859-1? As I wrote, the <sql> task
> complains if I set the client encoding to LATIN1 (which is the PostgreSQL
> equivalent of ISO-8859-1 if I'm right) that the JDBC driver is not going to
> like it. (And so it seems indeed.)

Correct: LATIN1 == ISO-8859-1. I can't help with the JDBC side, though.
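
To make the encoding point concrete, here is a small sketch in Python (chosen only because it is easy to run; nothing in this thread uses it). It shows that LATIN1 and ISO-8859-1 name the same encoding, that the same characters have different bytes in UTF-8, and one plausible way question marks can appear: a step that pushes the text through a charset lacking those characters. Whether that is what happens in the <sql> task here is an assumption, not something established in this thread.

```python
import codecs

text = "äöüé"

# LATIN1 and ISO-8859-1 are two names for the same single-byte encoding.
same_codec = codecs.lookup("latin1").name == codecs.lookup("iso-8859-1").name

latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # two bytes per character for these four

# Reading UTF-8 bytes as if they were LATIN1 does not fail; it silently
# produces the wrong characters.
misread = utf8_bytes.decode("iso-8859-1")

# Forcing the text through a charset that cannot represent these characters
# replaces each of them with a question mark.
replaced = text.encode("ascii", errors="replace")

print(same_codec, len(latin1_bytes), len(utf8_bytes))
print(repr(misread), replaced)
```

If the data ends up as question marks rather than as pairs of odd characters, that hints at a lossy conversion step rather than a simple mislabelling of the file.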

>> Since you're using Java, it's probably simplest just to use UTF-8 all
>> the way through. Crucially, make sure you know what the character-set of
>> the .sql file is - any good text editor should be able to tell you / set
>> this.
>
> As I wrote in my original post, I *have* tried using UTF-8 "all the way
> through" by converting the original ISO-8859-1 file to UTF-8 and calling the
> <sql> task with 'encoding="UTF-8"'. It didn't help, the special characters
> still became question marks. I've also set the client_encoding parameter in
> the sql file explicitly and I know, i.e., pgAdmin tells me the DB's encoding
> is UTF-8. (And it should be right, since *that* is able to insert special
> characters)
>
> So, to my best knowledge I got it right on all three places, and it still
> doesn't work. That's why I opened the topic in the first place.

Check again - something isn't right. Take the original .sql file, save
it as UTF-8 and add a line at the top: "set client_encoding=utf8;"

Run this through psql and it should work fine. If not, then the database
isn't in UTF-8 after all.

Assuming it works, then something in your Java setup isn't correct.
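
If you would rather not rely on an editor for the conversion, the preparation step can be scripted. A minimal sketch in Python, assuming the original file really is ISO-8859-1; the function name and the file names in the usage note are made up for illustration:

```python
def convert_sql_to_utf8(src, dst, source_encoding="iso-8859-1"):
    """Re-encode src as UTF-8 and put a client_encoding line at the top."""
    # newline="" keeps the file's original line endings untouched.
    with open(src, "r", encoding=source_encoding, newline="") as f:
        text = f.read()

    # Only add the SET line if the file does not already start with one.
    if not text.lstrip().lower().startswith("set client_encoding"):
        text = "set client_encoding=utf8;\n" + text

    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    return dst
```

Calling convert_sql_to_utf8("data.sql", "data_utf8.sql") and then running "psql -f data_utf8.sql yourdb" (hypothetical file and database names) gives a test that bypasses ant and JDBC entirely. Note that the script does not check whether the file's existing SET line, if any, names the right encoding.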

--
Richard Huxton
Archonet Ltd
