Re: PATCH: CITEXT 2.0

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: CITEXT 2.0
Date: 2008-07-07 07:46:10
Message-ID: 4871C9C2.8040307@sun.com
Lists: pgsql-hackers

David E. Wheeler wrote:
> Replying to myself, but I've made some local changes (see other
> messages) and just wanted to follow up on some of my own comments.
>
> On Jul 2, 2008, at 21:38, David E. Wheeler wrote:
>
>>> 4) Operator = citext_eq is not correct. See comment
>>> http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac
>>>
>>
>> So should citextcmp() call strncmp() instead of varstr_cmp()? The
>> latter is what I saw in varlena.c.
>
> I'm guessing that the answer is "no," since varstr_cmp() uses strncmp()
> internally, as appropriate to the locale. Correct?

You have to use varstr_cmp() in citextcmp(). Your code is correct, because the
< <= >= > operators need a collation-sensitive function.

You only need to change the citext_eq function to use strncmp() or to call the
texteq function.
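
The difference between the equality operator and the ordering operators can be sketched in plain C, outside the backend. This is only an illustration under stated assumptions: ci_fold, ci_fold_copy, ci_equal and ci_compare are hypothetical names that do not come from the patch, the case folding is ASCII-only for brevity, and the strcmp() tie-break is my assumption about a reasonable way to keep ordering consistent with equality.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fold one byte, ASCII letters only (an assumption;
 * real case folding would be locale-aware). */
static unsigned char ci_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char) (c - 'A' + 'a') : c;
}

/* Equality: same length and identical bytes after folding.
 * No strcoll() involved, so collation ties cannot make strings "equal". */
static bool ci_equal(const char *a, const char *b)
{
    size_t alen = strlen(a);

    if (alen != strlen(b))
        return false;
    for (size_t i = 0; i < alen; i++)
        if (ci_fold((unsigned char) a[i]) != ci_fold((unsigned char) b[i]))
            return false;
    return true;
}

/* Hypothetical helper: return a malloc'd, folded copy of s. */
static char *ci_fold_copy(const char *s)
{
    size_t len = strlen(s);
    char *out = malloc(len + 1);

    assert(out != NULL);
    for (size_t i = 0; i < len; i++)
        out[i] = (char) ci_fold((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

/* Ordering: locale-aware through strcoll(); if the locale reports a tie,
 * fall back to a bytewise comparison (assumed tie-break). */
static int ci_compare(const char *a, const char *b)
{
    char *fa = ci_fold_copy(a);
    char *fb = ci_fold_copy(b);
    int r = strcoll(fa, fb);

    if (r == 0)
        r = strcmp(fa, fb);
    free(fa);
    free(fb);
    return r;
}
```

The point of the sketch is that ci_equal never consults the locale, while ci_compare does, so < <= >= > follow the collation and = stays a plain byte comparison.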

>>> There must be a difference between equality and collation. For example,
>>> in Czech, 'láska' and 'laská' are different words, which means
>>> that 'láska' != 'laská'. But there is no difference in their collation
>>> order. See the Unicode Collation Algorithm for details.
>>
>> I'll leave the collation stuff to the functions I call (*far* from my
>> specialty), but I'll add a test for this and make sure it works as
>> expected. Um, although, with what collation should it be tested? The
>> tests I wrote assume en_US.UTF-8.
>
> I added this test and it passes:
>
> SELECT isnt( 'láska'::citext, 'laská'::citext, 'Different accented
> characters should not be equivalent' );

I think this test will always pass under en_US.UTF-8. I guess the test makes
sense only when a Czech collation (cs_CZ.UTF-8) is selected, but
unfortunately you cannot change the collation during your test :(.

I think the best solution for now is to keep the test and add a comment about
the recommended collation for this test.
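
To see why the result depends on the selected collation, one can probe strcoll() directly from a small standalone C program. This is a rough sketch, not part of the patch: collation_distinguishes is a hypothetical helper, and whether cs_CZ.UTF-8 is installed on a given machine is an assumption the code checks rather than relies on.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns -1 if the locale cannot be selected,
 * 1 if strcoll() tells a and b apart under that locale, 0 if it ties. */
static int collation_distinguishes(const char *locale,
                                   const char *a, const char *b)
{
    const char *cur = setlocale(LC_COLLATE, NULL);
    char *saved;
    int result;

    if (cur == NULL)
        return -1;

    /* Copy the current setting; later setlocale() calls may overwrite it. */
    saved = malloc(strlen(cur) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_COLLATE, locale) == NULL)
    {
        free(saved);
        return -1;              /* locale not installed */
    }

    result = (strcoll(a, b) != 0) ? 1 : 0;

    /* Restore the previous collation before returning. */
    setlocale(LC_COLLATE, saved);
    free(saved);
    return result;
}
```

Calling collation_distinguishes("cs_CZ.UTF-8", "láska", "laská") and comparing it with the result for "en_US.UTF-8" shows whether the two collations treat the pair differently on a given system; a return value of -1 means the locale is not available and the comparison says nothing.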

Zdenek
