
Re: About Unicode IVS

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Holger Jakobs <holger(at)jakobs(dot)com>
Cc:	pgsql-admin(at)lists(dot)postgresql(dot)org, n2029(at)ndensan(dot)co(dot)jp
Subject:	Re: About Unicode IVS
Date:	2022-03-29 10:25:56
Message-ID:	1107956.1648549556@sss.pgh.pa.us
Lists:	pgsql-admin

Holger Jakobs <holger(at)jakobs(dot)com> writes:
> It's totally correct that the two characters are still two characters.
> You would have to normalize the string first, so that the combination
> becomes one character.

Yeah. In principle the normalize() function ought to do this for
you. But it doesn't seem to shorten the given example for me;
I'm not sure if that means the example is incorrect, or if it's
a bug in normalize().

u8=# select octet_length(U&'\+008FBA' || U&'\+0E0102');
 octet_length
--------------
            7
(1 row)

u8=# select octet_length(normalize(U&'\+008FBA' || U&'\+0E0102'));
 octet_length
--------------
            7
(1 row)

regards, tom lane
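
As a cross-check outside PostgreSQL, here is a minimal sketch that runs the same sequence (U+8FBA followed by the variation selector U+E0102) through Python's standard unicodedata module. The helper name utf8_len and the variable names are illustrative only, not part of the thread. If an independent implementation also leaves the sequence unchanged, that would suggest normalization is not expected to merge a base character with a variation selector, rather than pointing at a bug specific to normalize().

```python
import unicodedata

# Base ideograph U+8FBA followed by variation selector U+E0102,
# the same two code points used in the psql session above.
ivs = "\u8fba\U000E0102"


def utf8_len(s: str) -> int:
    """Length in bytes of the UTF-8 encoding (comparable to octet_length)."""
    return len(s.encode("utf-8"))


# Apply each of the four Unicode normalization forms to the sequence.
results = {
    form: unicodedata.normalize(form, ivs)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

for form, out in results.items():
    # Columns: form, code point count, UTF-8 byte count, unchanged?
    print(form, len(out), utf8_len(out), out == ivs)
```

Each form should report 2 code points and 7 bytes, matching the octet_length results above: neither code point has a decomposition mapping, so there is nothing for normalization to compose or decompose.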

In response to

Re: About Unicode IVS at 2022-03-29 10:02:18 from Holger Jakobs

Responses

RE: About Unicode IVS at 2022-03-29 11:03:45 from 荒井元成
