| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Holger Jakobs <holger(at)jakobs(dot)com> |
| Cc: | pgsql-admin(at)lists(dot)postgresql(dot)org, n2029(at)ndensan(dot)co(dot)jp |
| Subject: | Re: About Unicode IVS |
| Date: | 2022-03-29 10:25:56 |
| Message-ID: | 1107956.1648549556@sss.pgh.pa.us |
| Lists: | pgsql-admin |
Holger Jakobs <holger(at)jakobs(dot)com> writes:
> It's totally correct that the two characters are still two characters.
> You would have to normalize the string first, so that the combination
> becomes one character.
Yeah. In principle the normalize() function ought to do this for
you. But it doesn't seem to shorten the given example for me;
I'm not sure if that means the example is incorrect, or if it's
a bug in normalize().
u8=# select octet_length(U&'\+008FBA' || U&'\+0E0102');
 octet_length
--------------
            7
(1 row)
u8=# select octet_length(normalize(U&'\+008FBA' || U&'\+0E0102'));
 octet_length
--------------
            7
(1 row)
regards, tom lane
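
Not part of the original message: one way to cross-check the result outside PostgreSQL is a minimal sketch using Python's standard `unicodedata` module. This assumes Python's normalization follows the same Unicode normalization forms that `normalize()` is meant to implement. The variable names (`ivs`, `byte_lengths`, `unchanged`) are illustrative only. The sketch shows where the 7 bytes come from and tests whether any of the four normalization forms alters the sequence.

```python
import unicodedata

# The sequence from the message: U+8FBA followed by variation selector U+E0102.
ivs = "\u8fba\U000e0102"

# UTF-8 byte count of each code point; the sum should match octet_length.
byte_lengths = [len(ch.encode("utf-8")) for ch in ivs]

# Apply each normalization form and record whether the string is left as-is.
unchanged = {
    form: unicodedata.normalize(form, ivs) == ivs
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

print(len(ivs), byte_lengths, sum(byte_lengths))
print(unchanged)
```

If every form leaves the string unchanged here as well, that would point toward the sequence simply having no composed form under Unicode normalization, rather than toward a bug specific to `normalize()`.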