Re: case insensitive collation of Greek's sigma

From: Gianni Ceccarelli <dakkar(at)thenautilus(dot)net>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Cc: Jakub Jedelsky <jakub(dot)jedelsky(at)gooddata(dot)com>
Subject: Re: case insensitive collation of Greek's sigma
Date: 2021-12-02 14:04:04
Message-ID: 20211202140404.5f8a7d23@exelion
Lists: pgsql-general

I realise this may not be applicable to the original problem, but
non-deterministic collations seem to offer a solution::

dakkar(at)[local] dakkar=> create collation "en-US-ins-icu" (
provider=icu,
locale='en-US-u-ks-level2',
deterministic=false
);

dakkar(at)[local] dakkar=> select 'ΣΣ' = 'σσ' collate "en-US-ins-icu";
┌──────────┐
│ ?column? │
├──────────┤
│ t        │
└──────────┘
(1 row)

dakkar(at)[local] dakkar=> select 'ΣΣ' = 'σς' collate "en-US-ins-icu";
┌──────────┐
│ ?column? │
├──────────┤
│ t        │
└──────────┘
(1 row)

dakkar(at)[local] dakkar=> select 'ΣΣ' = 'α' collate "en-US-ins-icu";
┌──────────┐
│ ?column? │
├──────────┤
│ f        │
└──────────┘
(1 row)

Notice, though:

* I don't understand what that ``-u-`` is doing in ``locale``, but
it's necessary
* as the docs
https://www.postgresql.org/docs/13/collation.html#COLLATION-NONDETERMINISTIC
say:

- B-tree cannot use deduplication with indexes that use a
nondeterministic collation
- certain operations are not possible with nondeterministic
collations, such as pattern matching operations (this means you
can't use ``LIKE``)
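
A minimal sketch of how the collation might be attached to a column, and
one possible way around the ``LIKE`` restriction. The table
``greek_words`` and its rows are hypothetical, and the ``collate "C"``
override is only an assumption about what is acceptable: it makes the
pattern match deterministic again, so it is case-sensitive::

    -- hypothetical table; assumes the "en-US-ins-icu" collation
    -- created earlier in this message already exists
    create table greek_words (
        word text collate "en-US-ins-icu"
    );
    insert into greek_words values ('ΣΣ'), ('σς'), ('α');

    -- equality picks up the column's nondeterministic collation,
    -- so the different sigma spellings should compare as equal
    select word from greek_words where word = 'σσ';

    -- LIKE is rejected under a nondeterministic collation (as per the
    -- PostgreSQL 13 docs above); an explicit deterministic collation
    -- makes the query legal, at the cost of case sensitivity
    select word from greek_words where word collate "C" like 'Σ%';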

--
Dakkar - <Mobilis in mobile>
GPG public key fingerprint = A071 E618 DD2C 5901 9574
6FE2 40EA 9883 7519 3F88
key id = 0x75193F88
