Re: How to display complicated Chinese character: Biang.

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>
Subject: Re: How to display complicated Chinese character: Biang.
Date: 2022-06-02 11:03:58
Message-ID: 55a71975b11634d6554559064f24931bb25408b9.camel@cybertec.at
Lists: pgsql-general

On Thu, 2022-06-02 at 12:45 +0530, jian he wrote:
> Trying to display some special Chinese characters in Postgresql.
>
> localhost:5433 admin(at)test=# show LC_COLLATE;
> +------------+
> | lc_collate |
> +------------+
> | C.UTF-8    |
> +------------+
>
> > with strings(s) as (
> >  values (U&'\+0030EDD')
> > )
> > select s,
> >   octet_length(s),
> >   char_length(s),
> >   (select count(*) from icu_character_boundaries(s,'en')) as graphemes from strings;
> >
>
> +-----+--------------+-------------+-----------+
> | s   | octet_length | char_length | graphemes |
> +-----+--------------+-------------+-----------+
> | ロD |            4 |           2 |         2 |
> +-----+--------------+-------------+-----------+
>
> Seems not right. graphemes should be 1?

You have an extra "0" there; "\+" Unicode escapes take exactly 6 hexadecimal
digits, so "\+0030ED" was read as U+30ED (ロ) and the trailing "D" as a
literal character:

WITH strings(s) AS (
   VALUES (U&'\+030EDD')
)
SELECT s,
       octet_length(s),
       char_length(s)
FROM strings;

 s  │ octet_length │ char_length
════╪══════════════╪═════════════
 𰻝 │            4 │           1
(1 row)
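
To check how the seven-digit escape from the original query was parsed, something along these lines may help. It is only a sketch: it assumes a UTF8 database and the default standard_conforming_strings = on, and the column aliases are made up for illustration.

```sql
-- Sketch, assuming a UTF8 database; aliases are illustrative only.
SELECT
    -- The 7-digit form is the 6-digit escape \+0030ED followed by a literal "D".
    U&'\+0030EDD' = (U&'\+0030ED' || 'D') AS parsed_as_two_chars,
    -- In UTF8, ascii() returns the code point of the first character.
    to_hex(ascii(U&'\+030EDD'))           AS intended_code_point;
```

The first column should come back true and the second should show 30edd, which would also account for the earlier result: ロ takes 3 bytes in UTF-8 and "D" takes 1, giving octet_length 4 and char_length 2.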

PostgreSQL itself doesn't have a function "icu_character_boundaries";
presumably it comes from an extension you have installed.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
