Re: ts_parse reports different between MacOS, FreeBSD/Linux

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mark Felder" <feld(at)FreeBSD(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ts_parse reports different between MacOS, FreeBSD/Linux
Date: 2020-12-22 18:46:44
Message-ID: 625396.1608662804@sss.pgh.pa.us
Lists: pgsql-general

"Mark Felder" <feld(at)FreeBSD(dot)org> writes:
> We have an application whose test suite fails on MacOS when running the search tests on unicode characters.

Yeah, known problem :-(. The text search parser relies on the C library's
locale data to classify characters as being letters, digits, etc.
Unfortunately, the UTF8 locales on macOS are just horribly bad, and
classify many characters differently from other platforms.
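
To make the dependency concrete, here is a minimal sketch of the kind of
lookup involved. It is not PostgreSQL's parser code: char_class() and
use_locale() are hypothetical helper names, and the only real APIs used are
the standard setlocale() and isw*() functions from the C library.

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not taken from PostgreSQL): report which class the
 * C library assigns to one wide character under the current LC_CTYPE
 * locale. The answer comes entirely from the platform's locale tables.
 */
static const char *
char_class(wint_t wc)
{
    if (iswalpha(wc))
        return "alpha";
    if (iswdigit(wc))
        return "digit";
    if (iswspace(wc))
        return "space";
    if (iswpunct(wc))
        return "punct";
    return "other";
}

/*
 * Hypothetical helper: select the locale whose tables the isw*() calls
 * above will consult. Returns 1 on success, 0 if the locale is unknown.
 */
static int
use_locale(const char *name)
{
    return setlocale(LC_CTYPE, name) != NULL;
}
```

Calling use_locale() with a UTF8 locale name that exists on the machine
(for example "en_US.UTF-8", which is an assumption about what is installed)
and then printing char_class() for a handful of non-ASCII code points on
each platform is one way to see where the classifications diverge. ASCII
characters should come out the same everywhere.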

I suppose that Apple has got reasonable Unicode character knowledge
somewhere in their OS; they are just not very interested in making the
POSIX locale APIs work well. Which leaves us with a bit of a problem
for getting consistent results cross-platform.

regards, tom lane
