Re: encoding affects ICU regex character classification

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Jeremy Schneider <schneider(at)ardentperf(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: encoding affects ICU regex character classification
Date:	2023-12-18 20:39:05
Message-ID:	3a86ea75efc0a7dd1b040d3358356c901a9c154a.camel@j-davis.com
Lists:	pgsql-hackers

On Fri, 2023-12-15 at 16:48 -0800, Jeremy Schneider wrote:
> This goes back to my other thread (which sadly got very little
> discussion): PostgreSQL really needs to be safe by /default/

Doesn't a built-in provider help create a safer option?

The built-in provider's version of Unicode will be consistent with
unicode_assigned(), which is a first step toward rejecting code points
that the provider doesn't understand. And by rejecting unassigned code
points, we get all kinds of Unicode compatibility guarantees that avoid
the kinds of change risks that you are worried about.
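
To make the idea of rejecting unassigned code points concrete, here is a minimal sketch in Python. It is not PostgreSQL's implementation. The helper names all_assigned() and reject_unassigned() are hypothetical, and the answer depends on the Unicode version bundled with the Python interpreter, not on the tables a built-in provider would use. It treats a code point as unassigned when its general category is Cn.

```python
import unicodedata


def all_assigned(text: str) -> bool:
    """Return True if no code point in text has general category Cn.

    Cn covers code points that are unassigned in the Unicode version
    this interpreter was built with (it also covers noncharacters).
    """
    return all(unicodedata.category(ch) != "Cn" for ch in text)


def reject_unassigned(text: str) -> str:
    """Hypothetical input check: refuse text containing unassigned code points."""
    if not all_assigned(text):
        raise ValueError("text contains unassigned code points")
    return text


if __name__ == "__main__":
    print(all_assigned("abc"))       # ordinary assigned characters
    print(all_assigned("a\u0378"))   # U+0378 is a reserved, unassigned code point
```

The point of such a check is that a string accepted today contains only code points whose properties are already defined, so a later Unicode update cannot change how those characters are classified just by assigning them.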

Regards,
Jeff Davis

In response to

Re: encoding affects ICU regex character classification at 2023-12-16 00:48:23 from Jeremy Schneider
