| From: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> | 
|---|---|
| To: | "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Character expansion with ICU collations | 
| Date: | 2021-06-09 17:54:54 | 
| Message-ID: | f7a2284c-9208-665e-d830-34e55e8d6f4d@enterprisedb.com | 
| Lists: | pgsql-hackers | 
On 09.06.21 17:31, Finnerty, Jim wrote:
> CREATE COLLATION CI_AS (provider = icu,
> locale = 'utf8@colStrength=secondary', deterministic = false);
> 
> CREATE TABLE MyTable3
> (
> 
>      ID INT IDENTITY(1, 1),
>      Comments VARCHAR(100)
> 
> )
> 
> INSERT INTO MyTable3 (Comments) VALUES ('strasse')
> INSERT INTO MyTable3 (Comments) VALUES ('straße')
> SELECT * FROM MyTable3 WHERE Comments COLLATE CI_AS = 'strasse'
> SELECT * FROM MyTable3 WHERE Comments COLLATE CI_AS = 'straße'
> 
> We would like to control whether each SELECT statement finds both 
> records (because the sort key of ‘ß’ equals the sort key of ‘ss’), or 
> whether each SELECT statement finds just one record.

You can have these queries return both rows if you use an
accent-ignoring collation, like this example in the documentation:

CREATE COLLATION ignore_accents (provider = icu,
    locale = 'und-u-ks-level1-kc-true', deterministic = false);
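
As a rough, untested sketch, the quoted queries could be tried against that collation as follows. It assumes PostgreSQL 12 or later built with ICU support. The table definition is rewritten in PostgreSQL syntax, since IDENTITY(1, 1) in the quoted example is not valid PostgreSQL; the lowercase names are just an illustration.

```sql
-- Nondeterministic, accent-ignoring collation from the documentation example.
CREATE COLLATION ignore_accents (provider = icu,
    locale = 'und-u-ks-level1-kc-true', deterministic = false);

-- PostgreSQL equivalent of the quoted table (assumed translation of
-- "ID INT IDENTITY(1, 1)").
CREATE TABLE mytable3 (
    id       int GENERATED ALWAYS AS IDENTITY,
    comments varchar(100)
);

INSERT INTO mytable3 (comments) VALUES ('strasse');
INSERT INTO mytable3 (comments) VALUES ('straße');

-- Compare under the accent-ignoring collation instead of CI_AS.
SELECT * FROM mytable3 WHERE comments COLLATE ignore_accents = 'strasse';
SELECT * FROM mytable3 WHERE comments COLLATE ignore_accents = 'straße';
```

If the collation treats 'ß' and 'ss' as equal at this strength, each SELECT should return both rows; the exact result can depend on the ICU version in use.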