Re: COPY support in pgsql-jdbc driver

From: Barry Lind <barry(at)xythos(dot)com>
To: Sam Varshavchik <mrsam(at)courier-mta(dot)com>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: COPY support in pgsql-jdbc driver
Date: 2002-06-20 17:11:49
Message-ID: 3D120CD5.9070401@xythos.com
Lists: pgsql-jdbc

Sam Varshavchik wrote:

> Barry Lind writes:
>
>> Sam Varshavchik wrote:
>>
>>> What's being dumped and reloaded here is a byte-stream
>>> (InputStream/OutputStream), not a character-stream (Reader/Writer).
>>> Presumably, the only thing that's ever going to be reloaded is
>>> something that was dumped previously, so no conversions are necessary.
>>
>>
>> This is not correct. The data coming from the server is a stream of
>> characters in the character encoding of the server. This character
>> encoding may be different from the client character encoding, and
>> therefore character set conversions are necessary. Let's say for
>> example the database is running with UTF-8 as its character set,
>> thus the output of the copy will be UTF-8 encoded. If the client is
>> running Latin1 then there will be a mismatch and all 8-bit characters
>> will be interpreted incorrectly by the client. Character set
>> conversion is necessary in this case.
>
>
> That only matters if you actually want to do something with the dumped
> data. If all you want is to be able to reload it later, why bother
> converting charset A to B, only to have it converted from B to A later?

True, but if you want to look at the data in an editor you might like it
to be in a character set you can view. Also, if you are going to reload
the data through a different client (e.g. psql), you won't easily be able
to, as it will assume the client encoding. I said earlier that if the
client encoding = server encoding this should be a no-op, so you would
always be able to get higher performance by specifying the same encoding
as the server. I also suggested that there be a method that doesn't take
an encoding and uses a default encoding instead. My recommendation was
that the default be the default encoding of the running JVM, but it could
default to the same encoding as the server. However, since all the JDK
methods that I am aware of which have an optional encoding argument
default to the JVM encoding, I think it would be confusing to do
something different here. But I could be convinced otherwise.
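
To make the proposal concrete, here is a rough sketch of the shape I have
in mind. It is only an illustration, not driver code: the class name
CopyTranscoder and the method name transcode are made up for this example,
and the server encoding is passed in explicitly rather than read from a
connection. One overload takes the client encoding; the other omits it and
falls back to the JVM default. When both encodings match, the bytes are
passed through untouched.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not part of the pgsql-jdbc driver.
final class CopyTranscoder {
    private CopyTranscoder() {}

    // Overload without a client encoding: uses the JVM default encoding,
    // as JDK methods with an optional encoding argument do.
    static void transcode(InputStream in, Charset serverEncoding,
                          OutputStream out) throws IOException {
        transcode(in, serverEncoding, out, Charset.defaultCharset());
    }

    // Converts COPY data from the server encoding to the client encoding.
    static void transcode(InputStream in, Charset serverEncoding,
                          OutputStream out, Charset clientEncoding)
            throws IOException {
        if (serverEncoding.equals(clientEncoding)) {
            // Same encoding on both sides: plain byte copy, no decoding.
            byte[] bytes = new byte[8192];
            int n;
            while ((n = in.read(bytes)) != -1) {
                out.write(bytes, 0, n);
            }
            out.flush();
            return;
        }
        // Different encodings: decode with the server encoding and
        // re-encode with the client encoding.
        Reader reader = new InputStreamReader(in, serverEncoding);
        Writer writer = new OutputStreamWriter(out, clientEncoding);
        char[] chars = new char[8192];
        int n;
        while ((n = reader.read(chars)) != -1) {
            writer.write(chars, 0, n);
        }
        writer.flush();
    }
}
```

A caller who only wants to dump and reload would pass the server encoding
as the client encoding and take the byte-copy path; a caller who wants to
read the file in an editor would pass whatever encoding the editor expects.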

thanks,
--Barry
