Re: Losing my latin on Ordering...

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Dominique Devienne <ddevienne(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Losing my latin on Ordering...
Date: 2023-02-14 11:40:46
Message-ID: 2d3c66e7c075ae9efe691eaa3b1040c6ce393ed7.camel@cybertec.at
Lists: pgsql-general

On Tue, 2023-02-14 at 12:17 +0100, Dominique Devienne wrote:
> On Tue, Feb 14, 2023 at 11:23 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > On Tue, 2023-02-14 at 10:31 +0100, Dominique Devienne wrote:
> > > Surely sorting should be "constant left-to-right", no? What are we missing?
> >
> > No, it isn't.  That's not how natural language collations work.
>
> Honestly, who expects the same prefix to sort differently based on what comes
> after, in left-to-right languages?
> How does one even find out what the (capricious?) rules for sorting in a given
> collation are?

Look at the documentation / implementation.

As far as ICU is concerned, see the Unicode Collation Algorithm: https://unicode.org/reports/tr10/

> > > I'm already surprised (star) comes before (space), when the latter "comes
> > > before" the former in both ASCII and UTF-8, but that the two "Foo*" and "Foo "
> > > prefixed pairs are not clustered after sorting is just mystifying to me. So how come?
> >
> > Because they compare as identical on the first three levels.  Any difference in
> > letters, accents or case carries more weight, even if it occurs to the right
> > of these substrings.
>
> That's completely unintuitive...

Well, you can complain to GNU and the Unicode consortium, but that's pretty
much the way it is.
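
To make the multi-level comparison described above concrete, here is a toy
sketch in Python.  It is not the actual ICU or glibc implementation: the
function name toy_sort_key, the sample strings, and the simplification that
spaces and punctuation only count on a fourth level are all assumptions made
for illustration.

```python
import unicodedata


def toy_sort_key(s: str):
    """Toy multi-level sort key (hypothetical; NOT the real UCA/ICU/glibc rules)."""
    decomposed = unicodedata.normalize("NFD", s)
    letters = [c for c in decomposed if c.isalnum()]
    # Level 1: base letters only, ignoring case, accents, spaces and punctuation.
    primary = [c.casefold() for c in letters]
    # Level 2: accents (combining marks split off by NFD normalization).
    secondary = [c for c in decomposed if unicodedata.combining(c)]
    # Level 3: case (False sorts before True, so lowercase comes first).
    tertiary = [c.isupper() for c in letters]
    # Level 4: whatever is left, i.e. spaces and punctuation.
    quaternary = [
        c for c in decomposed
        if not c.isalnum() and not unicodedata.combining(c)
    ]
    return (primary, secondary, tertiary, quaternary)


# Hypothetical sample strings, not the ones from the original report.
samples = ["Foo*bar", "Foo bar", "Foo*baz", "Foo baz"]

by_levels = sorted(samples, key=toy_sort_key)
by_code_point = sorted(samples)

print(by_levels)      # ['Foo bar', 'Foo*bar', 'Foo baz', 'Foo*baz']
print(by_code_point)  # ['Foo bar', 'Foo baz', 'Foo*bar', 'Foo*baz']
```

In the toy model the letter difference "r" versus "z" is decided on the first
level, so it outranks the space-versus-star difference even though it occurs
further to the right.  That is why the two "Foo*" strings do not end up next
to each other.  The relative order of space and star within each pair is an
artifact of the toy key and says nothing about a real collation.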

> > Yes, it sounds like the "C" collation may be best for you.  That is, if you don't
> > mind that "Z" < "a".
>
> I would mind if I asked for case-insensitive comparisons.
>
> So the "C" collation is fine with general UTF-8 encoding?
> I.e. it will be codepoint ordered OK?

Yes, exactly.
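
A small Python sketch of why that holds, assuming the "C" collation compares
strings byte by byte: in UTF-8, byte-wise order and code point order coincide.
The sample words are hypothetical, and Python's own string comparison (which
goes by code point) stands in for the database here.

```python
# Hypothetical sample words; \u00c9 is "É" and \u00e9 is "é".
words = ["Zebra", "apple", "\u00c9clair", "zebra", "Apple", "\u00e9clair"]

# Python compares str values code point by code point.
by_code_point = sorted(words)

# Comparing the UTF-8 encodings byte by byte gives the same order.
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

print(by_code_point == by_utf8_bytes)  # True
print("Z" < "a")                       # True: all of A-Z sort before a-z
print(by_code_point)
```

The flip side is visible in the sorted list: every uppercase ASCII letter
sorts before every lowercase one, and accented letters sort after all of
ASCII.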

Yours,
Laurenz Albe
