From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Built-in CTYPE provider |
Date: | 2023-12-20 19:24:55 |
Message-ID: | CA+TgmoYCNEm4tG0xgszXpyaM_-fu3onmHf2TDM1j5ru8ZmiCcQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Dec 20, 2023 at 2:13 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Wed, 2023-12-20 at 13:49 +0100, Daniel Verite wrote:
> > If the Postgres default was bytewise sorting+locale-agnostic
> > ctype functions directly derived from Unicode data files,
> > as opposed to libc/$LANG at initdb time, the main
> > annoyance would be that "ORDER BY textcol" would no
> > longer be the human-favored sort.
> > For the presentation layer, we would have to write for instance
> > ORDER BY textcol COLLATE "unicode" for the root collation
> > or a specific region-country if needed.
> > But all the rest seems better, especially cross-OS compatibility,
> > truly immutable and faster indexes for fields that
> > don't require linguistic ordering, alignment between Unicode
> > updates and Postgres updates.
>
> Thank you, that summarizes exactly the compromise that I'm trying to
> reach.
This makes sense to me, too, but it feels like it might work out
better for speakers of English than for speakers of other languages.
Right now, I tend to get databases that default to en_US.utf8, and if
the default changed to C.utf8, then the case-comparison behavior might
be different but the letters would still sort in the right order. For
someone who is currently defaulting to es_ES.utf8 or fr_FR.utf8, a
change to C.utf8 would be a much bigger problem, I would think. Their
alphabet isn't in code point order, and so things would be
alphabetized wrongly. That might be OK if they don't care about
ordering for any purpose other than equality lookups, but otherwise
it's going to force them to change the default, where today they don't
have to do that.
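
To make the code-point-order concern concrete, here is a minimal sketch in
Python rather than SQL. It relies on the fact that Python compares strings by
code point, which for UTF-8 text matches bytewise order. The word list, the
SPANISH_ALPHABET table, and the spanish_key helper are all made up for
illustration; a real linguistic collation (libc or ICU) handles far more cases
than this hand-rolled key.

```python
# Illustration only: sorted() on str compares by code point, which is the
# same order a bytewise comparison of UTF-8 text produces.
words = ["oso", "ñandú", "zorro", "nube"]

code_point_order = sorted(words)

# Hypothetical stand-in for a linguistic collation: rank each letter by its
# position in the Spanish alphabet, folding accented vowels to base letters.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"
FOLD = str.maketrans("áéíóúü", "aeiouu")

def spanish_key(word):
    """Return a sort key based on Spanish alphabet positions (assumption)."""
    return [SPANISH_ALPHABET.index(ch) for ch in word.lower().translate(FOLD)]

spanish_order = sorted(words, key=spanish_key)

print(code_point_order)  # ['nube', 'oso', 'zorro', 'ñandú']
print(spanish_order)     # ['nube', 'ñandú', 'oso', 'zorro']
```

In code point order, "ñ" (U+00F1) lands after "z" (U+007A), so "ñandú" sorts
last instead of between "nube" and "oso".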
--
Robert Haas
EDB: http://www.enterprisedb.com