Re: how is text-equality handled in postgresql?

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Ivan Voras <ivoras(at)freebsd(dot)org>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: how is text-equality handled in postgresql?
Date: 2014-01-15 12:29:01
Message-ID: CA+HiwqFKwftFjpyWaPPrn5Gv+37aGuL1qAozN2eGgjS-5YvOxg@mail.gmail.com
Lists: pgsql-general

On Wed, Jan 15, 2014 at 9:02 PM, Ivan Voras <ivoras(at)freebsd(dot)org> wrote:
> On 15/01/2014 12:36, Amit Langote wrote:
>> On Wed, Jan 15, 2014 at 7:39 PM, Ivan Voras <ivoras(at)freebsd(dot)org> wrote:
>>> On 15/01/2014 10:10, Gábor Farkas wrote:
>>>> hi,
>>>>
>>>> when i create an unique-constraint on a varchar field, how exactly
>>>> does postgresql compare the texts? i'm asking because in UNICODE there
>>>> are a lot of complexities about this..
>>>>
>>>> or in other words, when are two varchars equal in postgres? when their
>>>> bytes are? or some algorithm is applied?
>>>
>>> By default, it is "whatever the operating system thinks is right".
>>> PostgreSQL doesn't have its own collation code, it uses the OS's locale
>>> support for this.
>>>
>>
>> Just to add to this, whenever strcoll() (a locale aware comparator)
>> says two strings are equal, postgres re-compares them using strcmp().
>> See the following code snippet from
>> src/backend/utils/adt/varlena.c:varstr_cmp() -
>
>> /*
>>  * In some locales strcoll() can claim that nonidentical strings are
>>  * equal. Believing that would be bad news for a number of reasons,
>>  * so we follow Perl's lead and sort "equal" strings according to
>>  * strcmp().
>>  */
>> if (result == 0)
>>     result = strcmp(a1p, a2p);
>
> That seems odd and inefficient. Why would it be necessary? I would think
> indexing (and other collation-sensitive operations) don't care what the
> actual collation result is for arbitrary blobs of strings, as long as
> they are stable?
>

This has been the behavior for quite some time; it was introduced by
this commit:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=656beff59033ccc5261a615802e1a85da68e8fad
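
For illustration, the effect of that tie-break can be sketched in plain C.
This is only a minimal sketch, not the actual varstr_cmp() code: the name
text_cmp_with_tiebreak is hypothetical, and it assumes NUL-terminated
strings and whatever LC_COLLATE happens to be in effect for the process.

```c
#include <string.h>

/*
 * Hypothetical helper, NOT PostgreSQL's varstr_cmp(): compare two
 * NUL-terminated strings with the locale-aware strcoll(), and fall
 * back to the bytewise strcmp() only when strcoll() reports a tie.
 * The result is 0 only for byte-identical strings.
 */
static int
text_cmp_with_tiebreak(const char *a, const char *b)
{
    /* Locale-aware ordering under the current LC_COLLATE setting. */
    int result = strcoll(a, b);

    /* Some locales report "equal" for nonidentical strings. */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

The extra strcmp() only runs when strcoll() has already reported a tie, so
its cost is limited to that case. With it, the comparator returns 0 only
for byte-identical strings, while the ordering of everything else is still
whatever the locale dictates.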

--
Amit Langote
