From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Matthias Apitz <guru(at)unixarea(dot)de> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: sort order for UTF-8 char column with Japanese UTF-8 |
Date: | 2022-02-03 21:50:48 |
Message-ID: | CA+hUKGLR86ZK8dq0onE4ExMvtVU9w41ZpUsBjVxoddWzO1b0NA@mail.gmail.com |
Lists: | pgsql-general |
On Fri, Feb 4, 2022 at 8:11 AM Matthias Apitz <guru(at)unixarea(dot)de> wrote:
> On my FreeBSD laptop the same file sorts as
>
> guru(at)c720-r368166:~ $ LANG=de_DE.UTF-8 sort swd
> A
> ゲアハルト・A・リッター
> ゲルハルト・A・リッター
> チャールズ・A・ビアード
> A010STRUKTUR
> A010STRUKTUR
> A010STRUKTUR
> A0150SUPRALEITER
Wow, so it's one thing to have a different default "script order" than
glibc and ICU (which is something you can customise, IIRC), but isn't
something broken here if the Japanese text comes between "A" and
"A0..."? Hmm, it's almost as if the collation completely ignored the
Japanese text. From my FreeBSD box:
tmunro=> select * from t order by x collate "de_DE.UTF-8";
x
--------------------------
ゲアハルト
A
ゲアハルト・A・リッター
A0
A010STRUKTUR
AA
ゲアハルト・AA・リッター
ゲアハルト・B・リッター
(8 rows)
tmunro=> select * from t order by x collate "ja_JP.UTF-8";
x
--------------------------
A
A0
A010STRUKTUR
AA
ゲアハルト
ゲアハルト・AA・リッター
ゲアハルト・A・リッター
ゲアハルト・B・リッター
(8 rows)
Seems like something to investigate in FreeBSD land.
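The "completely ignored the Japanese text" guess can be checked against the de_DE.UTF-8 result above with a small simulation. This is only a sketch under stated assumptions: `ascii_only_key` is a hypothetical key, not FreeBSD's actual collation code. It assumes every non-ASCII character carries no weight, that the remaining ASCII characters compare by code point, and that ties are broken by comparing the full strings.

```python
# Rows of the example table t from the message above.
rows = [
    "A",
    "A0",
    "A010STRUKTUR",
    "AA",
    "ゲアハルト",
    "ゲアハルト・AA・リッター",
    "ゲアハルト・A・リッター",
    "ゲアハルト・B・リッター",
]

# Order reported above for: order by x collate "de_DE.UTF-8" (FreeBSD).
observed_de = [
    "ゲアハルト",
    "A",
    "ゲアハルト・A・リッター",
    "A0",
    "A010STRUKTUR",
    "AA",
    "ゲアハルト・AA・リッター",
    "ゲアハルト・B・リッター",
]


def ascii_only_key(s):
    """Hypothetical key: drop non-ASCII characters, tie-break on the full string."""
    # Assumption: characters outside ASCII are treated as weightless.
    visible = "".join(ch for ch in s if ord(ch) < 128)
    # Assumption: equal keys fall back to a plain code point comparison.
    return (visible, s)


simulated = sorted(rows, key=ascii_only_key)
print(simulated == observed_de)
```

Under these assumptions the simulated order equals the observed one, which is consistent with the guess but does not prove it; the actual cause would still have to be confirmed in FreeBSD's locale data.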