| From: | Martin Flahault <martin(at)billjobs(dot)com> | 
|---|---|
| To: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> | 
| Cc: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Collate order on Mac OS X, text with diacritics in UTF-8 | 
| Date: | 2010-01-13 15:15:06 | 
| Message-ID: | 2BAC69E9-7738-4F03-A149-83DC9F80729C@billjobs.com | 
| Lists: | pgsql-general | 
Here is an example:
postgres=# create database newbase;
CREATE DATABASE
postgres=# \c newbase;
psql (8.4.2)
You are now connected to database "newbase".
newbase=# create table t1 (contenu text);
CREATE TABLE
newbase=# insert into t1 values ('a'), ('e'), ('à'), ('é'), ('A'), ('E');
INSERT 0 6
newbase=# select * from t1 order by contenu;
 contenu 
---------
 A
 E
 a
 e
 à
 é
(6 rows)
newbase=# select * from t1 order by upper(contenu);
 contenu 
---------
 a
 A
 e
 E
 à
 é
(6 rows)
Here is the encoding information:
newbase=# \encoding
UTF8
newbase=# show lc_collate;
 lc_collate 
------------
 fr_FR
(1 row)
newbase=# show lc_ctype;
 lc_ctype 
----------
 fr_FR
(1 row)
As in other DBMSs (MySQL, for example), diacritics should be ignored when determining the sort order. Here is the expected output:
 a
 à
 A
 e
 é 
 E
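To make the difference concrete, here is a minimal Python sketch. It is only an illustration and not what PostgreSQL or the C library does internally. The names `rows` and `sort_key` are hypothetical. The key compares the base letters first (diacritics and case ignored), then case, then the presence of a diacritic, which reproduces the expected order listed above. Sorting the same values by plain code point reproduces the output of the first query, which suggests (but does not prove) that the fr_FR locale ends up comparing raw code points here.

```python
import unicodedata

# Same values as in table t1; à and é are written as precomposed code points.
rows = ["a", "e", "\u00e0", "\u00e9", "A", "E"]


def sort_key(text):
    """Hypothetical key: base letters first, then case, then diacritics."""
    # NFD splits "à" into "a" followed by a combining grave accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks to keep only the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    has_upper = base != base.lower()
    has_marks = len(base) != len(decomposed)
    return (base.lower(), has_upper, has_marks)


# Plain code point order: same as "order by contenu" in the session above.
print(sorted(rows))

# Diacritics ignored at the first level: the expected order.
print(sorted(rows, key=sort_key))
```

A real collation (for example one based on the Unicode Collation Algorithm) has more levels and language-specific rules than this three-part key, so treat it as a way to reason about the expected result only.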
It seems there is a problem with the collation order for text with diacritics on BSD systems when using UTF-8.
If you put this text:
a
A
à
é
e
E
in a UTF-8 text file and run the "sort" command on it, you get the same wrong output as with PostgreSQL:
A
E
a
e
à
é
Hope this helps,
Martin