Re: Order changes in PG16 since ICU introduction

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Sandro Santilli <strk(at)kbt(dot)io>, Regina Obe <lr(at)pcorp(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Order changes in PG16 since ICU introduction
Date: 2023-05-15 21:03:23
Message-ID: 91c0cdd2a402d59f8d6eaac40862d6f68e0b937c.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 2023-05-13 at 13:00 +0300, Alexander Lakhin wrote:
> On the current master (after 455f948b0, and before f7faa9976, of
> course)
> I get an ASAN-detected failure with the following query:
> CREATE COLLATION col (provider = icu, locale = '123456789012');
>

Thank you for the report!

ICU source specifically says:

/**
* Useful constant for the maximum size of the language
part of a locale ID.
* (including the terminating NULL).
* @stable ICU 2.0
*/
#define ULOC_LANG_CAPACITY 12

So the fact that it returning success without nul-terminating the
result is an ICU bug in my opinion, and I reported it here:

https://unicode-org.atlassian.net/browse/ICU-22394

Unfortunately that means we need to be a bit more paranoid and always
check for that warning, even if we have a preallocated buffer of the
"correct" size. It also means that both U_STRING_NOT_TERMINATED_WARNING
and U_BUFFER_OVERFLOW_ERROR will be user-facing errors (potentially
scary), unless we check for those errors each time and report specific
errors for them.

Another approach is to always check the length and loop using dynamic
allocation so that we never overflow (and forget about any static
buffers). That seems like overkill given that the problem case is
obviously invalid input; I think as long as we catch it and throw an
ERROR it's fine. But I can do this if others think it's worthwhile.

Patch attached. It just checks for the U_STRING_NOT_TERMINATED_WARNING
in a few places and turns it into an ERROR. It also cleans up the loop
for uloc_getLanguageTag() to check for U_STRING_NOT_TERMINATED_WARNING
rather than (U_SUCCESS(status) && len >= buflen).

--
Jeff Davis
PostgreSQL Contributor Team - AWS

Attachment Content-Type Size
0001-ICU-check-for-U_STRING_NOT_TERMINATED_WARNING.patch text/x-patch 5.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2023-05-15 21:16:31 Re: Order changes in PG16 since ICU introduction
Previous Message Marina Polyakova 2023-05-15 20:23:18 Re: Conflict between regression tests namespace & transactions due to recent changes