Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Date: 2017-08-09 17:53:06
Message-ID: CAH2-WznjeN2Wb7q5Gkbbiz-gfG7HB=igNsakCPafVKiamckYiw@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Wed, Aug 9, 2017 at 10:35 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> In other words, excluding, say, emoji collations from what gets
> imported is just making a value judgement that those collations aren't
> important and people shouldn't want to use them.

Yes, it is. I think that's fine, though. Other database systems that
use ICU for collations do this. Without exception, I think.

> It's saying that we
> know better than the ICU maintainers which collations ought to exist.

As I've pointed out already, we populate pg_collation by asking
ucol_getKeywordValuesForLocale() to get only "commonly used [variant]
values with the given locale" within pg_import_system_collations().
That's a value judgement, and the set of values it returns doesn't have
documented stability guarantees. I'm not sure how many technically distinct collations we
could generate by actually including every possible variant, but it
might be a great deal more than we get already.

Why would users have the same upgrade problem when they created these
collations manually? The BCP 47 language tag format seems to be very
much centered around "doing its best", even though that could be
pretty far from the desired behavior [1]. This makes a certain amount
of sense, considering that the tag format is well documented, and can be
considered a stable API. It really shouldn't break, but if it does then I suppose
it's probably because the behavior doesn't have an analog in the ICU
version in use.

[1] postgr.es/m/CAH2-Wzm22vtxvD-e1oz90DE8Z_M61_8amHsDOZf1PWRKfRmj1g@mail.gmail.com
--
Peter Geoghegan
