From: | 荒井元成 <n2029(at)ndensan(dot)co(dot)jp> |
---|---|
To: | "'Peter Eisentraut'" <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Cc: | <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Unicode Variation Selector and Combining character |
Date: | 2022-06-01 06:15:15 |
Message-ID: | 011f01d8757e$f5d69700$e183c500$@ndensan.co.jp |
Lists: | pgsql-hackers |
Thank you for your reply.
We use the IPAmj Mincho font, which the specifications of the Government of Japan call for.
https://moji.or.jp/mojikiban/font/
Example) IVS (Ideographic Variation Sequence)
I will attach an image.
D209007=# select char_length(U&'\+0066FE' || U&'\+0E0103') ;
char_length
-------------
2
(1 row)
I expect length 1.
Example) Combining character
I will attach an image.
D209007=# select char_length(U&'\+00304B' || U&'\+00309A') ;
char_length
-------------
2
(1 row)
I expect length 1.
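To make the difference between the two counts concrete, here is a minimal sketch outside PostgreSQL, in Python with only the standard library. It is not a full implementation of Unicode grapheme cluster segmentation (UAX #29); it only assumes that variation selectors and combining marks should not be counted on their own. The names is_extending and display_length are hypothetical and chosen for this illustration.

```python
import unicodedata


def is_extending(ch: str) -> bool:
    """Rough check for code points that attach to the preceding base character.

    Assumption: only variation selectors and combining marks are treated as
    extending. Real grapheme segmentation (UAX #29) has more rules.
    """
    cp = ord(ch)
    # VS1-VS16 and the supplementary VS17-VS256 used by ideographic variation sequences.
    if 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
        return True
    # Nonspacing (Mn), spacing combining (Mc), and enclosing (Me) marks.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def display_length(s: str) -> int:
    """Count code points that are not extending, i.e. approximate visible characters."""
    return sum(1 for ch in s if not is_extending(ch))


# The same two sequences as in the psql examples above.
ivs = "\u66FE\U000E0103"        # U+66FE followed by variation selector U+E0103
combining = "\u304B\u309A"      # U+304B followed by combining mark U+309A

print(len(ivs), display_length(ivs))              # code points vs. approximate characters
print(len(combining), display_length(combining))
```

In this sketch len() counts code points, which matches what char_length reports above, while display_length gives the count I expect. A string that begins with a combining mark would be undercounted, so this is only an approximation.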
Thank you.
-----Original Message-----
From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Sent: Wednesday, June 1, 2022 2:27 PM
To: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>; pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Unicode Variation Selector and Combining character
On 30.05.22 02:27, 荒井元成 wrote:
> I tried it on PostgreSQL 13. If you use a Unicode variation selector
> or a combining character, the base character plus the variation
> selector has a length of 2. Since it is displayed as one character,
> we expect the length to be 1. Please provide functions that handle
> Unicode variation selectors. I hope they can be provided as an
> extension.
>
> The functions that need to be supported are as follows:
>
> char_length|character_length|substring|trim|btrim|left
> |length|lpad|ltrim|regexp_match|regexp_matches
> |regexp_replace|regexp_split_to_array|regexp_split_to_table
> |replace|reverse|right|rpad|rtrim|split_part|strpos|substr|starts_with
Please show a test case of what you mean. For example,
select char_length(...) returns X but should return Y
Examples with Unicode escapes (U&'\NNNN...') would be the most robust.