From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Per-column collation |
Date: | 2010-11-16 19:00:47 |
Message-ID: | AANLkTimbcnWjUHKGGZZRgiptSLXNwfXT2MCsEFVMzUM6@mail.gmail.com |
Lists: | pgsql-hackers |
Hello
2010/11/16 Peter Eisentraut <peter_e(at)gmx(dot)net>:
> On mån, 2010-11-15 at 23:13 +0100, Pavel Stehule wrote:
>> a) the default encoding for a collation isn't the same as the default encoding of the database
>>
>> it's unfriendly at the least - the most used encoding is UTF8, but in
>> most cases users have to write locale.utf8.
>
> I don't understand what you are trying to say. Please provide more
> detail.
See below.
>
>> b) there is a bug - the default collation (the database collation) is ignored
>>
>>
>> postgres=# show lc_collate;
>> lc_collate
>> ────────────
>> cs_CZ.UTF8
>> (1 row)
>>
>> Time: 0.518 ms
>> postgres=# select * from jmena order by v;
>> v
>> ───────────
>> Chromečka
>> Crha
>> Drobný
>> Čečetka
>> (4 rows)
>>
>> postgres=# select * from jmena order by v collate "cs_CZ.utf8";
>> v
>> ───────────
>> Crha
>> Čečetka
>> Drobný
>> Chromečka
>> (4 rows)
>>
>> both results should be the same.
>
> I tried to reproduce this here but got the expected results. Could you
> try to isolate a complete test script?
>
I can't reproduce it now either, on a different system and computer. Maybe I
did something wrong. Sorry.
>> isn't the problem that the collation name is case sensitive? When I use the
>> lc_collate value, I get an error message
>>
>> postgres=# select * from jmena order by v collate "cs_CZ.UTF8";
>> ERROR: collation "cs_CZ.UTF8" for current database encoding "UTF8"
>> does not exist
>> LINE 1: select * from jmena order by v collate "cs_CZ.UTF8";
>>
>> the problem occurs when the table is created without an explicit collation.
>
> Well, I agree it's not totally nice, but we have to do something, and I
> think it's logical to use the locale names as collation names by
> default, and collation names are SQL identifiers. Do you have any ideas
> for improving this?
yes - my first question is: why do we need to specify the encoding when only
one encoding is supported? I can't use cs_CZ.iso88592 when my db
uses UTF8 - btw the error message is wrong:
yyy=# select * from jmena order by jmeno collate "cs_CZ.iso88592";
ERROR: collation "cs_CZ.iso88592" for current database encoding
"UTF8" does not exist
LINE 1: select * from jmena order by jmeno collate "cs_CZ.iso88592";
^
I don't know why, but the preferred encoding for Czech is now iso88592 -
but I can't use it - so I can't use the names "czech" or "cs_CZ". I
always have to use the full name "cs_CZ.utf8". That's wrong. Moreover, from
this moment on my application depends on the encoding that was used first - I
can't change the encoding without refactoring the SQL statements - because the
encoding is hard-coded there (in the collation clause).
So I don't understand why you fill the pg_collation table with a thousand
collations that cannot be used. If I use UTF8, then there
should be just UTF8-based collations. And if you need to work with a wide
range of collations, then I am for preferring UTF8 - at least for the Central
Europe region. If somebody wants to use collations here, then he will
use a combination of cs, de, en - so he must use latin2 and latin1,
or utf8. I think the encoding should not be part of the collation name when
that is possible.
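
To make the proposal concrete, here is a minimal sketch of the lookup I have in mind. It is only an illustration, not PostgreSQL code: the catalog contents, the encoding-to-suffix mapping, and the function names usable_collations and resolve_collation are all assumptions made up for this example, and locale modifiers such as "@euro" are ignored.

```python
# Hypothetical sketch, not PostgreSQL code. The catalog is modelled as
# (collation name, database encoding) pairs using names from this thread.
CATALOG = [
    ("cs_CZ.utf8", "UTF8"),
    ("cs_CZ.iso88592", "LATIN2"),
]

# Assumed mapping from a database encoding to the locale-name suffix.
SUFFIX = {"UTF8": "utf8", "LATIN2": "iso88592"}


def usable_collations(db_encoding):
    """List only the collations that can be used with this database encoding."""
    return sorted(name for name, enc in CATALOG if enc == db_encoding)


def resolve_collation(name, db_encoding):
    """Map "cs_CZ" or "cs_CZ.UTF8" to the collation for db_encoding."""
    base, _, suffix = name.partition(".")
    wanted = SUFFIX[db_encoding]
    # An explicit suffix is accepted only if it names the database encoding;
    # the comparison ignores case, so "UTF8" and "utf8" are treated alike.
    if suffix and suffix.lower() != wanted:
        raise LookupError(
            f'collation "{name}" does not match database encoding "{db_encoding}"'
        )
    candidate = f"{base}.{wanted}"
    if candidate not in usable_collations(db_encoding):
        raise LookupError(f'collation "{name}" does not exist')
    return candidate


print(resolve_collation("cs_CZ", "UTF8"))       # cs_CZ.utf8
print(resolve_collation("cs_CZ.UTF8", "UTF8"))  # cs_CZ.utf8
print(resolve_collation("cs_CZ", "LATIN2"))     # cs_CZ.iso88592
```

With a lookup like this, the same SQL statement (collate "cs_CZ") would keep working after the database encoding changes, and the lc_collate value "cs_CZ.UTF8" would be accepted as written.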
Regards
Pavel