Re: how is text-equality handled in postgresql?

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: how is text-equality handled in postgresql?
Date: 2014-01-15 12:02:00
Message-ID: lb5tbc$vo6$1@ger.gmane.org
Lists: pgsql-general

On 15/01/2014 12:36, Amit Langote wrote:
> On Wed, Jan 15, 2014 at 7:39 PM, Ivan Voras <ivoras(at)freebsd(dot)org> wrote:
>> On 15/01/2014 10:10, Gábor Farkas wrote:
>>> hi,
>>>
>>> when i create an unique-constraint on a varchar field, how exactly
>>> does postgresql compare the texts? i'm asking because in UNICODE there
>>> are a lot of complexities about this..
>>>
>>> or in other words, when are two varchars equal in postgres? when their
>>> bytes are? or some algorithm is applied?
>>
>> By default, it is "whatever the operating system thinks is right".
>> PostgreSQL doesn't have its own collation code, it uses the OS's locale
>> support for this.
>>
>
> Just to add to this, whenever strcoll() (a locale aware comparator)
> says two strings are equal, postgres re-compares them using strcmp().
> See the following code snippet from
> src/backend/utils/adt/varlena.c:varstr_cmp():

>     /*
>      * In some locales strcoll() can claim that nonidentical strings are
>      * equal. Believing that would be bad news for a number of reasons,
>      * so we follow Perl's lead and sort "equal" strings according to
>      * strcmp().
>      */
>     if (result == 0)
>         result = strcmp(a1p, a2p);

That seems odd and inefficient. Why would it be necessary? I would think
that indexing (and other collation-sensitive operations) doesn't care what
the actual collation result is for arbitrary strings, as long as the
result is stable?
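
For reference, the tie-break quoted above can be sketched as a standalone C function. This is only a minimal illustration, not the PostgreSQL source: the name collate_with_tiebreak is hypothetical, and the sketch assumes NUL-terminated strings and whatever locale the process currently has set.

```c
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL's actual code): compare two
 * NUL-terminated strings with the locale-aware strcoll(), and fall
 * back to the bytewise strcmp() when strcoll() reports a tie.
 */
int
collate_with_tiebreak(const char *a, const char *b)
{
    int result = strcoll(a, b);

    /* strcoll() may return 0 for strings whose bytes differ. */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

With this scheme the function returns 0 only for byte-identical strings, while strings that strcoll() does distinguish are still ordered according to the locale.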
