From: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
---|---|
To: | Marc Millas <marc(dot)millas(at)mokadb(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: sort order |
Date: | 2021-08-06 15:57:51 |
Message-ID: | 3940ae46-b123-f58e-aa58-e6f9309b5ef6@enterprisedb.com |
Lists: | pgsql-general |
On 27.07.21 19:07, Marc Millas wrote:
> so, obviously, both lc_collate knows about the é
> but obviously, too, they do behave differently on the impact of the
> beginning white space.
>
> I didn't see anything about this behaviour on the doc, unless the
> reference at the libc should be understood as please read and test libc
> doc on each platform.
> So my first question is: why ?
> My second question is: how to make the centos postgres behave like the
> w10 one ??
> ie. knowing about french characters AND taking beginning white spaces
> into account ?
There are multiple standard ways to deal with space and punctuation
characters when sorting. See
<https://unicode-org.github.io/icu/userguide/collation/customization/ignorepunct.html>
for a description. Not all collation providers implement all of them,
but the behavior you happen to get is usually one of them. The CentOS 7
behavior corresponds to "shift-trimmed"; the Windows one appears to
match "non-ignorable". If you want to get the latter on Linux as
well, you can use the ICU locales, which also default to non-ignorable.
For example,
    select * from test order by ble collate "fr-x-icu";
matches your Windows output for me.
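
To compare the two behaviors side by side, something along these lines might help. It is only a sketch: the table name "test" and column "ble" come from the thread, but the text column type, the sample rows, and the collation name fr_shifted are assumptions made for illustration. It also assumes a PostgreSQL build with ICU support and an ICU version recent enough to accept BCP 47 locale tags. Note that ICU's "shifted" option is related to, but not the same as, the "shift-trimmed" behavior mentioned above.

```sql
-- Hypothetical sample data: only the names "test" and "ble" are from the
-- thread; the column type and the rows are assumed for illustration.
CREATE TABLE test (ble text);
INSERT INTO test (ble) VALUES ('a'), (' b'), ('é'), (' é'), ('c');

-- ICU default (non-ignorable): a leading space is a significant character.
SELECT ble FROM test ORDER BY ble COLLATE "fr-x-icu";

-- For comparison: a custom ICU collation in which spaces and punctuation
-- are ignored at the primary comparison levels ("shifted").
-- fr_shifted is a made-up name.
CREATE COLLATION fr_shifted (provider = icu, locale = 'fr-u-ka-shifted');
SELECT ble FROM test ORDER BY ble COLLATE fr_shifted;

-- Optionally make the ICU ordering the column default, so that a plain
-- ORDER BY ble uses it without an explicit COLLATE clause.
ALTER TABLE test ALTER COLUMN ble TYPE text COLLATE "fr-x-icu";
```

Running the two SELECT statements against the same rows should make it visible whether leading spaces affect the ordering under each collation.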