RE: About Unicode IVS

From: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>
To: "'Graham Myers'" <gmyers(at)retailexpress(dot)com>, "'David G(dot) Johnston'" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: RE: About Unicode IVS
Date: 2022-03-29 08:52:45
Message-ID: 012001d8434a$5bf28150$13d783f0$@ndensan.co.jp
Lists: pgsql-admin

Where should I submit a request if I would like PostgreSQL to handle this?

Is this mailing list the right place?

Moto.

From: Graham Myers <gmyers(at)retailexpress(dot)com>
Sent: Tuesday, March 29, 2022 5:26 PM
To: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>; David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: RE: About Unicode IVS

Thank you for the explanation; Unicode always blows my mind 😊 The problem is that Postgres counts code points, and your example contains two.
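
As a rough illustration outside the database, the same count can be reproduced in Python, whose len() also counts code points. This is only a sketch; the string literal is assumed to mirror the U&'\+008FBA' || U&'\+0E0102' example quoted further down the thread.

```python
import unicodedata

# U+8FBA (the base kanji) followed by U+E0102 (a variation selector),
# matching the two code points concatenated in the example query.
s = "\u8fba\U000e0102"

# len() counts code points, so the selector is counted separately.
print(len(s))

# List each code point with its Unicode name to see what is being counted.
for ch in s:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
```

The string is displayed as one glyph, but it is stored as two code points, which is what the length reflects.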

Graham Myers

From: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>
Sent: 29 March 2022 09:21
To: 'Graham Myers' <gmyers(at)retailexpress(dot)com>; 'David G. Johnston' <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: RE: About Unicode IVS

Thank you for your reply.

Because the two code points are displayed as a single character.

This applies to sequences with Unicode variation selectors as well as combining characters.
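
A minimal sketch of the kind of custom counting being asked about, written in Python rather than as a PostgreSQL function. The name display_length is hypothetical, and the rule it applies (skip variation selectors and combining marks) is an assumption that only approximates what a reader sees as one character; it is not full Unicode grapheme cluster segmentation.

```python
import unicodedata


def display_length(text: str) -> int:
    """Count code points, skipping variation selectors and combining marks.

    Hypothetical helper: an approximation of "characters as displayed",
    not an implementation of Unicode grapheme cluster rules.
    """
    count = 0
    for ch in text:
        cp = ord(ch)
        # Variation selectors: VS1-VS16 and VS17-VS256 (the latter are used by IVS).
        is_variation_selector = 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF
        # General categories Mn, Mc and Me are combining marks.
        is_combining_mark = unicodedata.category(ch).startswith("M")
        if not (is_variation_selector or is_combining_mark):
            count += 1
    return count


# Kanji + variation selector, and "e" + combining acute accent.
print(display_length("\u8fba\U000e0102"))
print(display_length("e\u0301"))
```

Both sample strings contain two code points but are counted as one here. Sequences this rule does not cover, such as emoji joined with zero-width joiners, would still be overcounted.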

Moto.

From: Graham Myers <gmyers(at)retailexpress(dot)com>
Sent: Tuesday, March 29, 2022 4:46 PM
To: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>; David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: RE: About Unicode IVS

Why do you expect the concatenation of two characters to return a length of one?

Graham Myers

From: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>
Sent: 29 March 2022 05:35
To: 'David G. Johnston' <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: RE: About Unicode IVS

Thank you for your reply.

char_length() also returns 2 characters.

select char_length(U&'\+008FBA' || U&'\+0E0102');

 char_length
-------------
           2
(1 row)

select length('辺󠄂');

 length
--------
      2
(1 row)

select char_length('辺󠄂');

 char_length
-------------
           2
(1 row)

$ psql -l

                          List of databases
   Name    |  Owner  | Encoding | Collate | Ctype |  Access privileges
-----------+---------+----------+---------+-------+---------------------
 D209007   | D209007 | UTF8     | C       | C     |
 postgres  | D209007 | UTF8     | C       | C     |
 template0 | D209007 | UTF8     | C       | C     | =c/D209007         +
           |         |          |         |       | D209007=CTc/D209007
 template1 | D209007 | UTF8     | C       | C     | =c/D209007         +
           |         |          |         |       | D209007=CTc/D209007
(4 rows)

$ cat pgdata/PG_VERSION

13

Moto.

From: David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
Sent: Tuesday, March 29, 2022 12:38 PM
To: 荒井元成 <n2029(at)ndensan(dot)co(dot)jp>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: About Unicode IVS

On Monday, March 28, 2022, 荒井元成 <n2029(at)ndensan(dot)co(dot)jp> wrote:

Hi,

The length() function returns 2 characters where I want it to return 1 character.

Is it possible to handle this by changing a setting, such as the collation, as in SQL Server?
Also, if you know how to deal with it (e.g., by creating a custom function), I would appreciate as much information as you can provide.

Try char_length(text) instead.

David J.
