From: | John Naylor <john(dot)naylor(at)postgresql(dot)org> |
---|---|
To: | pgsql-committers(at)lists(dot)postgresql(dot)org |
Subject: | pgsql: Update display widths as part of updating Unicode |
Date: | 2021-08-26 15:06:00 |
Message-ID: | E1mJGx6-0002Xn-4v@gemulon.postgresql.org |
Lists: | pgsql-committers |
Update display widths as part of updating Unicode
The hardcoded "wide character" set in ucs_wcwidth() was last updated
around the Unicode 5.0 era. This led to misalignment when printing
emojis and other codepoints that have since been designated
wide or full-width.
To fix this and keep the set up to date, extend update-unicode to download
the list of wide and full-width codepoints from the official sources.
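
As a rough illustration of how a generated table of wide and full-width ranges can drive a width lookup, here is a minimal sketch in C. It is not the code from this commit: the names cp_interval, example_wide_table, in_table and example_display_width are hypothetical, and the table holds only a small hand-picked subset of ranges rather than the full generated list. It also ignores zero-width and control characters, which a real width function has to handle separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: an inclusive range of Unicode codepoints. */
typedef struct
{
    uint32_t first;
    uint32_t last;
} cp_interval;

/*
 * Illustrative subset only (assumption: a generated header would list
 * every wide/full-width range).  Must be sorted and non-overlapping.
 */
static const cp_interval example_wide_table[] = {
    {0x1100, 0x115F},   /* Hangul Jamo initial consonants */
    {0x3041, 0x3096},   /* Hiragana */
    {0x4E00, 0x9FFF},   /* CJK Unified Ideographs */
    {0xFF01, 0xFF60},   /* Fullwidth ASCII variants */
    {0x1F600, 0x1F64F}, /* Emoticons */
};

/* Binary search: returns 1 if cp falls inside any interval, else 0. */
static int
in_table(uint32_t cp, const cp_interval *table, size_t count)
{
    size_t lo = 0;
    size_t hi = count;  /* search the half-open index range [lo, hi) */

    assert(table != NULL);

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (cp > table[mid].last)
            lo = mid + 1;
        else if (cp < table[mid].first)
            hi = mid;
        else
            return 1;
    }
    return 0;
}

/* Simplified width: 2 columns for listed codepoints, 1 otherwise. */
static int
example_display_width(uint32_t cp)
{
    size_t n = sizeof(example_wide_table) / sizeof(example_wide_table[0]);

    return in_table(cp, example_wide_table, n) ? 2 : 1;
}
```

With a table like this, an emoji such as U+1F600 is reported as two columns wide, while an ASCII letter stays at one. Keeping the ranges in a generated header means a Unicode update only requires regenerating the data, not editing the lookup code.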
In passing, remove some comments about non-spacing characters that
haven't been accurate since we removed the former hardcoded logic.
Jacob Champion
Reported and reviewed by Pavel Stehule
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRCeX21O69YHxmykYySYyprZAqrKWWg0KoGKdjgqcGyygg(at)mail(dot)gmail(dot)com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/bab982161e0590746a2fd2a03043b27108b23ac6
Modified Files
--------------
src/common/unicode/.gitignore | 1 +
src/common/unicode/Makefile | 9 +-
.../generate-unicode_east_asian_fw_table.pl | 76 +++++++++++++
src/common/wchar.c | 41 +++----
src/include/common/unicode_east_asian_fw_table.h | 120 +++++++++++++++++++++
5 files changed, 220 insertions(+), 27 deletions(-)