From: | Ken Sell <ksell(at)greenplum(dot)com> |
---|---|
To: | Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> |
Cc: | "pgsql-odbc(at)postgresql(dot)org" <pgsql-odbc(at)postgresql(dot)org> |
Subject: | Re: DataDirect Driver, ExecDirect and UTF-8 |
Date: | 2009-05-14 22:17:34 |
Message-ID: | C631E68E.133%ksell@greenplum.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-odbc |
Hi,
Thanks very much for the information. I think I have it figured out.
Here is my understanding. Feel free to correct me if I have gotten something wrong.
In order for everything to work correctly with a UTF-8 application, a DataDirect driver manager, and psqlodbc you must do the following.
- Use the header files (e.g. sql.h) from the DataDirect driver manager installation.
- Compile the driver with the SQLWCHARSHORT define (e.g. #define SQLWCHARSHORT=1).
This will cause SQLWCHAR to be defined as an unsigned short.
- Compile the application with the SQLWCHARSHORT define (e.g. #define SQLWCHARSHORT=1).
This will cause SQLWCHAR to be defined as an unsigned short.
- Set the DriverUnicodeType data source attribute to 1 (i.e. DriverUnicodeType=1 in your odbc.ini file).
This tells the driver manager that your driver is expecting UTF-16 encoding for all SQLWCHAR
parameters.
- If your database encoding is UTF-8, the descriptor for character columns will be a wide type.
For example, a char(1) column will be SQL_WCHAR.
- If you do something with a sql statement (e.g. Do a SQLExecDirectW("insert into T1 values ('a')),
the driver manager (DataDirect) will convert the string from UTF-8 to UTF-16, pass it to the
the driver as UTF-16. The driver will then translate it back to UTF-8. When the driver is
unicode (i.e your compiled psqlodbc with as a unicode driver), then this translation
in the driver manager happens weather or not you use a wide function. The driver manager
converts both SQLCHAR and SQLWCHAR parameters to UTF-16. It also changes all calls to
non-wide functions to calls to wide functions to the driver (e.g. SQLExecDirect gets changed
to SQLExecDirectW). Note, SQLCHAR and SQLWCHAR strings are not translated from UTF-8
to UTF-16 in the same way. If the call is to a non-wide function, then the translation assumes
that the string contains only ASCII characters. A non-ASCII character will probably get
translated incorrectly. A call to wide functions translates UTF-8 to UTF-16 correctly.
- When using parameters, you must be use the appropriate SQL and C type when describing
the parameters column values. If you say the C type is SQL_C_CHAR, the value is assumed
to be in UTF-8 by the driver for a database encoded in UTF-8. IF you say the C type is
SQL_C_WCHAR, I think the driver will assume the value is in UTF-16 and try to translate
it to UTF-8.
- When using column values, you must use the appropriate SQL C type. If you say the column C
type is SQL_C_WCHAR, then the returned value will be translated to UTF-16 by the driver.
If you say the column C type is SQL_C_CHAR, then the value will be returned as UTF-8.
....Ken
On 5/5/09 11:13 PM, "Hiroshi Inoue" <inoue(at)tpf(dot)co(dot)jp> wrote:
Ken Sell wrote:
> Hi,
>
> I wasn't a member of the mailing list when I sent this, so I'm not sure
> it actually made it out there.
> I apologize if this is a duplicate.
>
> ....Ken
>
>
> On 5/4/09 11:49 AM, "Ken Sell" <ksell(at)greenplum(dot)com> wrote:
>
> Hi,
>
> I'm a the new connectivity developer at GreenPlum. GreenPlum makes a
> data warehouse DBMS based on PostgreSQL.
> I'm working on a problem where a user is attempting to insert a
> non-ASCII UTF-8 values (i.e. An umlaut over an 'o').
> The test does an insert via an SQLExecDirectW into a table. The
> text looks like this:
>
> "insert into t1 values ('ö')"
>
> I've built and debugged the postgreSQL driver (version 8.02.0500).
> It looks like the text makes it through the
> Driver Manager (i.e. DataDirect) ok. I see the correct value in
> SQLExecDirectW in odbcapiw.c, but I
> also see the code in SQLExecDirectW call ucs2_to_utf8. ucs2_to_utf8
> tries to interpret the value as
> UCS2, but the value is UTF-8. The value is corrupted by ucs2_to_utf8.
Psqlodbc Unicode driver uses UTF-16 encoding and your application uses
UTF-8 encoding. Isn't the URL
http://media.datadirect.com/download/docs/odbc/allodbc/reference/unicode6.html
related to your problem ?
regards,
Hiroshi Inoue
From | Date | Subject | |
---|---|---|---|
Next Message | V S P | 2009-05-17 05:58:53 | [Q] SQLMoreResults causes error in SQLFetchScroll |
Previous Message | imad | 2009-05-14 14:52:01 | Re: psqlodbc driver returning empty string |