Re: sort order

From:	Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To:	Marc Millas <marc(dot)millas(at)mokadb(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: sort order
Date:	2021-08-06 15:57:51
Message-ID:	3940ae46-b123-f58e-aa58-e6f9309b5ef6@enterprisedb.com
Lists:	pgsql-general

On 27.07.21 19:07, Marc Millas wrote:
> so, obviously, both lc_collate knows about the é
> but obviously, too, they do behave differently on the impact of the
> beginning white space.
>
> I didn't see anything about this behaviour on the doc, unless the
> reference at the libc should be understood as please read and test libc
> doc on each platform.
> So my first question is: why ?
> My second question is: how to make the centos postgres behave like the
> w10 one ??
> ie. knowing about french characters AND taking beginning white spaces
> into account ?

There are multiple standard ways to deal with space and punctuation
characters when sorting. See
<https://unicode-org.github.io/icu/userguide/collation/customization/ignorepunct.html>
for a description. Not all collation providers implement all of them,
but the behavior you happen to get is usually one of them. The CentOS 7
behavior corresponds to "shift-trimmed", while the Windows one appears to
match "non-ignorable". If you want to get the latter on Linux as
well, you can use the ICU locales, which also default to non-ignorable.
For example,

select * from test order by ble collate "fr-x-icu";

matches your Windows output for me.
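
To see the two treatments of leading spaces side by side, something along these lines may help. It is only a sketch: the table name sort_demo, its rows, and the collation name fr_shifted are made up for illustration, and it assumes a PostgreSQL build with ICU support (version 10 or later) and an ICU library recent enough to accept BCP 47 language-tag keywords such as "ka". The "shifted" setting is ICU's nearest relative of the behavior where spaces and punctuation are ignored at the primary level; it is not claimed to reproduce the glibc ordering exactly.

```sql
-- Hypothetical demo table; only the column name "ble" comes from the thread.
CREATE TABLE sort_demo (ble text);
INSERT INTO sort_demo (ble) VALUES ('ab'), (' ac'), ('éa'), (' ea'), ('eb');

-- Predefined ICU French collation: alternate handling defaults to
-- non-ignorable, so a leading space takes part in the comparison.
SELECT ble FROM sort_demo ORDER BY ble COLLATE "fr-x-icu";

-- Custom ICU collation with alternate handling set to "shifted"
-- (BCP 47 key "ka"): spaces and punctuation are ignored at the
-- primary level and only act as a late tie-breaker, if at all.
CREATE COLLATION fr_shifted (provider = icu, locale = 'fr-u-ka-shifted');
SELECT ble FROM sort_demo ORDER BY ble COLLATE fr_shifted;
```

With the first query, the rows that start with a space should come out ahead of the others; with the second, they should be interleaved with the rest according to their letters. Comparing both result sets against the output of the libc collation on each platform shows which treatment that platform applies.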
