From: | "David E(dot) Wheeler" <david(at)kineticode(dot)com> |
---|---|
To: | Gregory Stark <stark(at)enterprisedb(dot)com> |
Cc: | "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Teodor Sigaev" <teodor(at)sigaev(dot)ru>, "Zdenek Kotala" <Zdenek(dot)Kotala(at)Sun(dot)COM>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: CITEXT 2.0 |
Date: | 2008-07-06 00:46:39 |
Message-ID: | DD2B2B80-66FD-46B7-9D5B-0AF94C264E55@kineticode.com |
Lists: | pgsql-hackers |
On Jul 5, 2008, at 02:58, Gregory Stark wrote:
>> txt = cilower( PG_GETARG_TEXT_PP(0) );
>> str = VARDATA_ANY(txt);
>>
>> result = hash_any((unsigned char *) str, VARSIZE_ANY_EXHDR(txt));
>
> I thought your data type implemented a locale dependent collation, not just
> a case insensitive collation. That is, does this hash agree with your
> citext_eq on strings like "foo bar" <=> "foobar" and "fooß" <=> "fooss" ?
CITEXT is basically intended to replace all those queries that do
`WHERE LOWER(col) = LOWER(?)` by doing it internally. That's it. It's
locale-aware to the same extent that `LOWER()` is (whereas citext 1.0
is not, since it compares only ASCII characters case-insensitively).
And I expect that it does, in fact, agree with your examples, in that
all the current tests for = and <> pass:
    try=# select 'foo bar' = 'foobar';
     ?column?
    ----------
     f

    try=# SELECT 'fooß' = 'fooss';
     ?column?
    ----------
     f
> You may have to use strxfrm
In the patch against CVS HEAD, it uses str_tolower() in formatting.c.
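
To make the invariant under discussion concrete, here is a minimal standalone sketch, not the patch's code: equality and hashing both go through the same lowercasing step, so two values that compare equal always hash the same. The names `fold_lower`, `fnv1a`, `ci_eq`, and `ci_hash` are hypothetical. Per-byte `tolower()` stands in for `cilower()`/`str_tolower()`, and FNV-1a stands in for `hash_any()`. It assumes NUL-terminated strings and the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cilower(): returns a malloc'd copy of s with
 * each byte passed through tolower(). The caller frees the result. */
char *fold_lower(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    char *out = malloc(len + 1);
    if (out == NULL)
        abort();
    for (size_t i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) s[i]);
    out[len] = '\0';
    return out;
}

/* 32-bit FNV-1a, used here only as a stand-in for hash_any(). */
uint32_t fnv1a(const unsigned char *p, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Equality: compare the folded forms byte for byte. */
int ci_eq(const char *a, const char *b)
{
    char *la = fold_lower(a);
    char *lb = fold_lower(b);
    int eq = (strcmp(la, lb) == 0);
    free(la);
    free(lb);
    return eq;
}

/* Hash: hash the same folded form that ci_eq() compares. */
uint32_t ci_hash(const char *s)
{
    char *ls = fold_lower(s);
    uint32_t h = fnv1a((const unsigned char *) ls, strlen(ls));
    free(ls);
    return h;
}
```

Because both functions fold first, `ci_eq(a, b)` implies `ci_hash(a) == ci_hash(b)`. The reverse does not hold, since unequal strings can still collide. A collation that treated "foo bar" and "foobar" as equal would need a different normalization step feeding both functions.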
Best,
David