From: | Graham Myers <gmyers(at)retailexpress(dot)com> |
---|---|
To: | ่ไบๅ ๆ <n2029(at)ndensan(dot)co(dot)jp>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | pgsql-admin(at)lists(dot)postgresql(dot)org |
Subject: | RE: About Unicode IVS |
Date: | 2022-03-29 08:26:09 |
Message-ID: | d60efdf8caa7379a7483cd530ba5098e@mail.gmail.com |
Lists: | pgsql-admin |
Thank you for the explanation; Unicode always blows my mind. The
problem is that Postgres counts code points, and your example contains
two.
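The point about code points can be checked outside the database. The following is a minimal Python sketch, not part of the original thread, that takes the same two code points used in the `char_length` query further down (U+8FBA followed by U+E0102) and counts them the way a code-point-based length function does. The variable names are illustrative only.

```python
import unicodedata

# The string from the thread: U+8FBA followed by the variation selector U+E0102.
s = "\u8fba\U000e0102"

# Python's len() counts code points, which matches the result of 2 in the thread.
code_points = len(s)

# UTF-8 byte length: 3 bytes for U+8FBA plus 4 bytes for U+E0102.
utf8_bytes = len(s.encode("utf-8"))

# List each code point with its Unicode name to show what is being counted.
for ch in s:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))

print(code_points, utf8_bytes)
```

The string renders as one glyph, yet it holds two code points, so any length function defined over code points reports 2.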
*From:* ่ไบๅ
ๆ <n2029(at)ndensan(dot)co(dot)jp>
*Sent:* 29 March 2022 09:21
*To:* 'Graham Myers' <gmyers(at)retailexpress(dot)com>; 'David G. Johnston' <david(dot)g(dot)johnston(at)gmail(dot)com>
*Cc:* pgsql-admin(at)lists(dot)postgresql(dot)org
*Subject:* RE: About Unicode IVS
Thank you for your reply.
I expected 1 because the two code points are displayed as a single character.
This is the case for Unicode variation selectors and combining characters.
Moto.
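One way to get the count described here is to skip code points that only modify the preceding character. The Python sketch below is an illustration under stated assumptions, not something proposed in the thread. `visible_length` is a hypothetical helper name. It relies on Unicode assigning variation selectors and combining accents to the Mark general categories (Mn, Mc, Me). It only approximates how text is displayed and does not implement full grapheme cluster segmentation (for example, emoji joined with ZWJ are not handled).

```python
import unicodedata

def visible_length(text: str) -> int:
    """Approximate count of displayed characters (hypothetical helper)."""
    # Skip code points in the Mark categories (Mn, Mc, Me); variation
    # selectors and combining accents fall into these categories.
    return sum(1 for ch in text if not unicodedata.category(ch).startswith("M"))

ivs = "\u8fba\U000e0102"   # base ideograph + variation selector (from the thread)
accented = "e\u0301"       # "e" + COMBINING ACUTE ACCENT

print(len(ivs), visible_length(ivs))
print(len(accented), visible_length(accented))
```

The same filtering idea could be moved into a user-defined database function, but that would need to be checked against the server's encoding and regular expression support.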
*From:* Graham Myers <gmyers(at)retailexpress(dot)com>
*Sent:* Tuesday, March 29, 2022 4:46 PM
*To:* ่ไบๅ
ๆ <n2029(at)ndensan(dot)co(dot)jp>; David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com>
*Cc:* pgsql-admin(at)lists(dot)postgresql(dot)org
*Subject:* RE: About Unicode IVS
Why do you expect the concatenation of two characters to return a length of
one?
Graham Myers
*From:* ่ไบๅ
ๆ <n2029(at)ndensan(dot)co(dot)jp>
*Sent:* 29 March 2022 05:35
*To:* 'David G. Johnston' <david(dot)g(dot)johnston(at)gmail(dot)com>
*Cc:* pgsql-admin(at)lists(dot)postgresql(dot)org
*Subject:* RE: About Unicode IVS
Thank you for your reply.
It still returns 2 characters.
select char_length(U&'\+008FBA' || U&'\+0E0102');
char_length
-------------
2
(1 row)
select length('辺󠄂');
length
--------
2
(1 row)
select char_length('辺󠄂');
char_length
-------------
2
(1 row)
$ psql -l
                            List of databases
   Name    |  Owner  | Encoding | Collate | Ctype |  Access privileges
-----------+---------+----------+---------+-------+---------------------
 D209007   | D209007 | UTF8     | C       | C     |
 postgres  | D209007 | UTF8     | C       | C     |
 template0 | D209007 | UTF8     | C       | C     | =c/D209007         +
           |         |          |         |       | D209007=CTc/D209007
 template1 | D209007 | UTF8     | C       | C     | =c/D209007         +
           |         |          |         |       | D209007=CTc/D209007
(4 rows)
$ cat pgdata/PG_VERSION
13
Moto.
*From:* David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
*Sent:* Tuesday, March 29, 2022 12:38 PM
*To:* ่ไบๅ
ๆ <n2029(at)ndensan(dot)co(dot)jp>
*Cc:* pgsql-admin(at)lists(dot)postgresql(dot)org
*Subject:* Re: About Unicode IVS
On Monday, March 28, 2022, ่ไบๅ ๆ <n2029(at)ndensan(dot)co(dot)jp> wrote:
Hi,
The length() function returns 2 for a string that I want to be counted as
1 character.
Can this be addressed by changing a setting, such as the collation,
as in SQL Server?
Also, if you know a way to handle it (e.g., by creating a custom
function), I would appreciate as much information as you can
provide.
Try char_length(text) instead.
David J.