From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Subject: | How to display complicated Chinese character: Biang. |
Date: | 2022-06-02 07:15:34 |
Message-ID: | CACJufxFcCqgSQNcwD2uy=NagmohN2yEeymDwoV-EA1=QDyDZqQ@mail.gmail.com |
Lists: | pgsql-general |
Inspired by this thread:
https://www.postgresql.org/message-id/011f01d8757e%24f5d69700%24e183c500%24%40ndensan.co.jp
I am trying to display some special Chinese characters in PostgreSQL. For now I
am using PostgreSQL 15 beta1 on Ubuntu 20.
localhost:5433 admin@test=# show LC_COLLATE;
+------------+
| lc_collate |
+------------+
| C.UTF-8 |
+------------+
localhost:5433 admin@test=# select icu_unicode_version();
+---------------------+
| icu_unicode_version |
+---------------------+
| 13.0 |
+---------------------+
icu_unicode_version() is a function provided by an extension.
Wikipedia article about the character Biang: https://en.wikipedia.org/wiki/Biangbiang_noodles
Quote:
> The character's traditional and simplified forms were added to Unicode
> <https://en.wikipedia.org/wiki/Unicode> version 13.0 in March 2020 in the CJK
> Unified Ideographs Extension G
> <https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_Extension_G> block
> of the newly allocated Tertiary Ideographic Plane
> <https://en.wikipedia.org/wiki/Tertiary_Ideographic_Plane>. The
> corresponding Unicode characters are:
>
Unicode character info: https://www.compart.com/en/unicode/U+30EDD
Query:
with strings(s) as (
    values (U&'\+0030EDD')
)
select s,
    octet_length(s),
    char_length(s),
    (select count(*) from icu_character_boundaries(s,'en')) as graphemes
from strings;
Result:
+-----+--------------+-------------+-----------+
| s | octet_length | char_length | graphemes |
+-----+--------------+-------------+-----------+
| ロD | 4 | 2 | 2 |
+-----+--------------+-------------+-----------+
This does not look right: shouldn't graphemes be 1?
I am also not sure whether U&'\+0030EDD' is really the same as 𰻝.
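A possible check for the second question, as a minimal sketch: per the PostgreSQL documentation on Unicode escapes, the \+ form takes exactly six hexadecimal digits. If so, U&'\+0030EDD' would be read as \+0030ED (U+30ED, the katakana ロ) followed by a literal D, which would match the ロD in the output above. U+30EDD itself would then be written as U&'\+030EDD'. The column aliases below are only illustrative, and the query assumes a UTF8-encoded database.

```sql
-- Sketch, assuming a UTF8-encoded database and standard_conforming_strings = on.
-- The \+ escape consumes exactly six hex digits, so U+30EDD is written \+030EDD.
with strings(s) as (
    values (U&'\+030EDD')
)
select s,
       octet_length(s)  as octets,       -- UTF-8 bytes in the string
       char_length(s)   as chars,        -- number of characters (code points)
       to_hex(ascii(s)) as first_cp_hex  -- code point of the first character, in hex
from strings;
```

If the escape is parsed as a single code point, chars should be 1 and first_cp_hex should be 30edd. The grapheme count from icu_character_boundaries can then be rechecked on that string.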
--
I recommend David Deutsch's <<The Beginning of Infinity>>
Jian