From: | pgsql-bugs(at)postgresql(dot)org |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Bug #659: lower()/upper() bug on ->multibyte<- DB |
Date: | 2002-05-07 14:51:12 |
Message-ID: | 20020507145112.BE39A476356@postgresql.org |
Lists: | pgsql-bugs pgsql-hackers |
Michael Enke (michael(dot)enke(at)wincor-nixdorf(dot)com) reports a bug with a severity of 2
The lower the number the more severe it is.
Short Description
lower()/upper() bug on ->multibyte<- DB
Long Description
OS: Linux Kernel 2.4.4, PostgreSQL version 7.2.1
lower() and upper() don't work as expected on multibyte
databases. They work fine with single-byte encodings.
The behaviour can be reproduced as follows:
at initdb: LC_CTYPE was set to de_DE
createdb -E UTF-8 name
export PGCLIENTENCODING=LATIN1
psql -U name
--------------------------------------------------
=> select lower('Ä'); -- german umlaut A, capital
ERROR: Could not convert UTF-8 to ISO8859-1
-- I expected to see: german umlaut a, lower case
--------------------------------------------------
=> select lower('ä'); -- german umlaut a, lower case
ERROR: Could not convert UTF-8 to ISO8859-1
-- I expected to see: german umlaut a, lower case
--------------------------------------------------
=> select upper('ä'); -- it doesn't translate
-- I expected to see: Ä
--------------------------------------------------
=> select upper('Ä'); -- this works fine
--------------------------------------------------
The same happens with Ö and Ü (O umlaut, U umlaut).
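The results above are consistent with one hypothesis, which the report itself does not confirm: that the server applies the single-byte (Latin-1) case mapping of the de_DE locale to each byte of the UTF-8 string. The Python sketch below models only that assumption. `bytewise_case` and `is_valid_utf8` are hypothetical helper names, Python's case rules stand in for the C library's, and the third and fourth queries are assumed to use a-umlaut and A-umlaut respectively.

```python
def bytewise_case(data: bytes, fn) -> bytes:
    """Apply a case function to each byte, treating every byte as one Latin-1 character."""
    out = bytearray()
    for b in data:
        mapped = fn(chr(b))
        # Keep the original byte when the mapping leaves Latin-1 or changes length.
        if len(mapped) == 1 and ord(mapped) < 256:
            out.append(ord(mapped))
        else:
            out.append(b)
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    cases = (
        ("lower", "\u00c4", str.lower),  # A-umlaut, capital
        ("lower", "\u00e4", str.lower),  # a-umlaut, lower case
        ("upper", "\u00e4", str.upper),  # a-umlaut, lower case
        ("upper", "\u00c4", str.upper),  # A-umlaut, capital
    )
    for name, text, fn in cases:
        before = text.encode("utf-8")
        after = bytewise_case(before, fn)
        print(name, before.hex(), "->", after.hex(), "valid UTF-8:", is_valid_utf8(after))
```

Under this model both lower() calls turn the lead byte 0xC3 into 0xE3, which leaves a byte sequence that is no longer valid UTF-8 and so cannot be converted to the client encoding. Both upper() calls leave the bytes unchanged, so a-umlaut stays untranslated and A-umlaut appears to work.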
If you want to reproduce this and don't have ä/Ä on your keyboard,
you can create a table with one column, type varchar(1) (on a MB DB).
Create a file with the following input:
ae is \u00e4
AE is \u00c4
Then run the native2ascii tool that ships with Java:
native2ascii -reverse -utf8 <this-file> <new-file>
In <new-file> you will see:
in the first line, 2 bytes: A (with tilde on top) and the Euro symbol,
in the second line, 2 bytes: A (with tilde on top) and a dotted box
unset PGCLIENTENCODING, call psql:
insert into table values('<copy and paste first two bytes>');
insert into table values('<copy and paste second two bytes>');
export PGCLIENTENCODING=LATIN1
psql: select * from table; will show you the a-umlaut and A-umlaut.
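The byte pairs described for <new-file> can be checked without native2ascii. This is a minimal Python sketch, assuming the terminal that showed the Euro symbol was using ISO-8859-15 (the report does not say which charset it used). `utf8_as_single_byte` is a hypothetical helper name.

```python
def utf8_as_single_byte(ch: str, charset: str) -> str:
    """Encode a character as UTF-8, then misread those bytes in a single-byte charset."""
    return ch.encode("utf-8").decode(charset)


if __name__ == "__main__":
    for label, ch in (("ae", "\u00e4"), ("AE", "\u00c4")):
        raw = ch.encode("utf-8")
        # repr() keeps the unprintable second byte of AE visible as an escape.
        print(label, raw.hex(),
              repr(utf8_as_single_byte(ch, "iso8859-15")),
              repr(utf8_as_single_byte(ch, "latin-1")))
```

The a-umlaut encodes to the bytes C3 A4. Read as ISO-8859-15, that is A-tilde followed by the Euro sign; in plain ISO-8859-1 the second byte is the generic currency sign instead. The A-umlaut encodes to C3 84, whose second byte is a control character in both charsets, which would explain the dotted box.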
Sample Code
No file was uploaded with this report