| From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
|---|---|
| To: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Collation version tracking for macOS |
| Date: | 2022-06-03 19:23:05 |
| Message-ID: | CA+hUKGLAAd+ftrUxJUV=Grgmf7gk0oVd_msG3oty2ufv-HVx+g@mail.gmail.com |
| Lists: | pgsql-hackers |
On Sat, Jun 4, 2022 at 12:17 AM Peter Eisentraut
<peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> On 07.05.22 02:31, Thomas Munro wrote:
> > Last time I looked into this it seemed like macOS's strcoll() gave
> > sensible answers in the traditional single-byte encodings, but didn't
> > understand UTF-8 at all so you get C/strcmp() order. In other words
> > there was effectively nothing to version.
>
> Someone recently told me that collations in macOS have actually changed
> recently and that this is a live problem. See explanation here:
>
> https://github.com/PostgresApp/PostgresApp/blob/master/docs/documentation/reindex-warning.md?plain=1#L66

How can I see evidence of this? I'm comparing Debian, FreeBSD and
macOS 12.4 and when I run "LC_COLLATE=en_US.UTF-8 sort
/usr/share/dict/words" I get upper and lower case mixed together on
the other OSes, but on the Mac the upper case comes first, which is my
usual smoke test for "am I looking at binary sort order?"
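
The same smoke test can be sketched in Python, which calls the C library's collation routines through the `locale` module. This is only an illustration: the helper names `collate_sorted` and `looks_binary` and the `SAMPLE` word list are made up for this example, and the result depends entirely on which locales the host system provides.

```python
import locale
from functools import cmp_to_key

# Hypothetical sample: mixed-case words whose order differs between
# binary (code point) sorting and typical linguistic sorting.
SAMPLE = ["apple", "Banana", "cherry", "Apple", "banana", "Cherry"]


def collate_sorted(words, locale_name):
    """Sort words with the C library's collation under the given LC_COLLATE."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        # Raises locale.Error if the locale is not installed on this system.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(words, key=cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


def looks_binary(words, locale_name):
    """True if the locale's order for these words equals code point order."""
    return collate_sorted(words, locale_name) == sorted(words)


if __name__ == "__main__":
    for name in ("C", "en_US.UTF-8"):
        try:
            print(name, "binary-like:", looks_binary(SAMPLE, name))
        except locale.Error:
            print(name, "is not available on this system")
```

A binary-like result for `en_US.UTF-8` would match the "upper case comes first" behaviour described above; a different result would suggest the platform applies real collation rules for that locale. A small sample like this can only hint at the answer, so it is no substitute for sorting a full word list.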