Re: Very strange Error in Updates

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: "Dario V(dot) Fassi" <software(at)sistemat(dot)com(dot)ar>
Cc: Kris Jurka <books(at)ejurka(dot)com>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Very strange Error in Updates
Date: 2004-07-15 06:48:58
Message-ID: 40F628DA.5010002@opencloud.com
Lists: pgsql-hackers pgsql-jdbc

Dario V. Fassi wrote:
> Server 7.3.4 for W2K and Linux too.
> Encoding SQL_ASCII in both cases.
>
> I understand the source of the problem, but ASCII encoding is
> not 7 bits; it has 8 bits with international charsets in codepages,
> like the values in the examples.
> You are talking about the US-ASCII charset, which is a 7-bit subset of Unicode.

You're arguing over nomenclature here. At the end of the day, a
PostgreSQL database encoding of SQL_ASCII means 7-bit ASCII; if you call
that US-ASCII, fine, but it doesn't change the problem. With an encoding
of SQL_ASCII the server does not have sufficient information to
translate characters >127 between the database encoding and UNICODE,
which is required by the JDBC driver (even if the JDBC driver did not
set client_encoding to UNICODE, it'd still have to somehow do this
translation itself since Java strings are represented as UTF-16).

See http://www.postgresql.org/docs/current/static/multibyte.html for
some more details. The JDBC driver will always use a "client character
set" of UNICODE when talking to a >= 7.3 server.
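
To illustrate the point about characters >127, here is a minimal Java sketch (not driver code; the class and method names are made up for this example). It shows that a Java string holding such a character has no 7-bit ASCII form, and that its UTF-8 byte length differs from its character count:

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Round-trip through US-ASCII; Java substitutes '?' for anything
    // that cannot be represented in 7 bits.
    static String toAscii(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String s = "Espa\u00f1a";              // contains U+00F1, which is > 127
        System.out.println(s.length());        // Java (UTF-16) length
        System.out.println(utf8Length(s));     // UTF-8 byte length
        System.out.println(toAscii(s));        // lossy 7-bit form
    }
}
```

The '?' substitution here is what Java's own encoder does; it is not necessarily what the server does with the same character.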

> That aside, and speaking in CHARS: if I'm putting a 30-char
> string into a field 30 chars long,
> I think the driver can/must ensure that a 30-char string is transferred.
> Maybe a "data truncation" warning would be acceptable, or a replacement
> byte/char, or dropping the eighth bit,
> but it's not sufficient reason to abort the update.
>
> What's your opinion?

The server already does a replacement -- the problem is that the
replacement may be longer than one character (see the referenced docs
above for handling of unrepresentable characters). So the server-side
representation of a "30 character" Java string may actually be longer
than 30 characters in the database encoding.
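
A rough Java sketch of why the length can grow. The replacement format used here (hex bytes wrapped in parentheses) is an assumption made for illustration, and replaceNonAscii is a hypothetical helper, not something the server or driver exposes:

```java
import java.nio.charset.StandardCharsets;

public class ReplacementLengthDemo {
    // Hypothetical stand-in for a server-side substitution: every
    // character above 127 becomes its UTF-8 bytes in hex, in parentheses.
    static String replaceNonAscii(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 128) {
                out.append((char) cp);
            } else {
                byte[] bytes = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                out.append('(');
                for (byte b : bytes) {
                    out.append(String.format("%02x", b & 0xff));
                }
                out.append(')');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 29 ASCII characters plus one character > 127: 30 Java chars.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 29; i++) sb.append('a');
        sb.append('\u00f1');
        String value = sb.toString();

        String stored = replaceNonAscii(value);
        System.out.println(value.length());   // length seen by the Java application
        System.out.println(stored.length());  // length after the assumed replacement
    }
}
```

Under that assumption a single replaced character is enough to push a 30-character value past a 30-character column limit.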

Either way there's nothing the driver can really do about it -- we don't
want to duplicate all the knowledge about charset conversions on the
driver side (currently, the driver does know some details about
encodings, but that's only there to support pre-7.3 servers). We just
hand off a valid UNICODE string and let the server deal with it. If the
server generates an error and aborts the transaction -- too bad, it's
not the driver's fault.

The best option is to fix your database encoding; UNICODE is the safest
choice if you're only talking to it via JDBC. If you really want silent
truncation (bad idea!) you can get that via an explicit cast to
varchar(30) in your query.
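
For the explicit-cast route, a minimal JDBC sketch follows. The table and column names (customers, name, id) and the helper methods are hypothetical; only the CAST(... AS varchar(30)) idea comes from the text above:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class TruncatingUpdate {
    // Builds an UPDATE whose value is explicitly cast to varchar(maxLen),
    // so an over-long value is truncated instead of rejected.
    // Table and column names are placeholders.
    static String buildUpdateSql(int maxLen) {
        return "UPDATE customers SET name = CAST(? AS varchar(" + maxLen + ")) WHERE id = ?";
    }

    // Runs the update on an already-open connection and returns the row count.
    static int updateName(Connection conn, int id, String name) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(buildUpdateSql(30))) {
            ps.setString(1, name);
            ps.setInt(2, id);
            return ps.executeUpdate();
        }
    }
}
```

Note that this discards data without telling anyone, which is why it is the fallback here and not the recommendation.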

-O
