Re: Built-in CTYPE provider

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Peter Eisentraut" <peter(at)eisentraut(dot)org>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Built-in CTYPE provider
Date: 2024-01-18 19:42:10
Message-ID: f5c4d836-80e5-4978-ad40-c3749957d9e8@manitou-mail.org
Lists: pgsql-hackers

Peter Eisentraut wrote:

> > If the Postgres default was bytewise sorting+locale-agnostic
> > ctype functions directly derived from Unicode data files,
> > as opposed to libc/$LANG at initdb time, the main
> > annoyance would be that "ORDER BY textcol" would no
> > longer be the human-favored sort.
>
> I think that would be a terrible direction to take, because it would
> regress the default sort order from "correct" to "useless". Aside from
> the overall message this sends about how PostgreSQL cares about
> locales and Unicode and such.

Well, offering a viable way to avoid, as much as possible,
the dreaded:

"WARNING: collation "xyz" has version mismatch
... HINT: Rebuild all objects affected by this collation..."

doesn't sound like a bad message to send.

Currently, to keep in codepoint order the indexes that don't need a
linguistic order, you're supposed to use COLLATE "C", which then means
that upper(), lower(), etc. don't work beyond ASCII.
Here our Unicode support is not good enough, and the proposal
addresses that.
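
To make the trade-off concrete, here is a rough sketch in Python (not PostgreSQL code) that models the two behaviors being discussed: bytewise/codepoint ordering, and case mapping that is either limited to ASCII or derived from Unicode data. The helper name ascii_upper and the sample words are made up for illustration; ascii_upper only approximates what an ASCII-only upper() does, and Python's str.upper() stands in for a Unicode-aware mapping.

```python
def ascii_upper(s: str) -> str:
    """Hypothetical helper: upper-case only a-z, leave everything else as-is."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


# "\u00e9" is the precomposed letter e with acute accent.
words = ["zebra", "Zoo", "\u00e9clair", "apple"]

# Sorting by UTF-8 bytes yields the same order as sorting by code point,
# which is why a bytewise collation never depends on a locale library version.
bytewise = sorted(words, key=lambda w: w.encode("utf-8"))
codepoint = sorted(words)  # Python compares str values by code point

print(bytewise == codepoint)  # True
print(bytewise)               # uppercase ASCII first, accented letters last

# ASCII-only case mapping leaves the accented letters untouched...
print(ascii_upper("\u00e9t\u00e9"))
# ...while a mapping derived from Unicode data handles them.
print("\u00e9t\u00e9".upper())
```

The point of the sketch is that the sort order and the case mapping are independent choices: one can keep the stable codepoint order and still have case functions that work beyond ASCII.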

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
