From: | Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Роман Литовченко <roman(dot)lytovchenko(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows |
Date: | 2020-09-03 08:57:27 |
Message-ID: | 20200903105727.064665ce@firost |
Lists: | pgsql-bugs |
On Thu, 3 Sep 2020 10:26:03 +0200
Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> On 2020-09-03 09:41, Daniel Verite wrote:
> > Jehan-Guillaume de Rorthais wrote:
> >
> >> Maybe Daniel has some more experience feedback with other customizations
> >
> > No, I've just tried various other reorderings, and didn't find any other
> > that seems to have the same bug as latn-digit.
> > My tests consisted of indexing a large corpus of text and running the
> > index through amcheck.
>
> In this case I'm tempted to just leave it alone and write it off as a
> bug in ICU. We could potentially inspect the collator object at CREATE
> COLLATION time and issue warnings if we find something we know to be buggy.
>
> I don't think we want to make our code uglier and slower
It's not that much uglier, only slower. And maybe we could wrap the logic inside
some dedicated function or macro that checks for versions, etc.
> for normal uses to work around a bug in some niche feature in ICU.
Well, indeed, I was wondering in another thread if we should fix it or
document it.
However, raising a WARNING doesn't seem enough, as we would effectively let
the user create a buggy collation and maybe a corrupted index on top of it. *If*
we choose this way, I would vote for an ERROR.
However, as I wrote earlier, we have no hard evidence that latn-digit is the
only buggy customization in ICU. Even if the probability is very low, we
might have to pile up more tests on versions, customizations, etc. For
instance, we would have to exclude latn-digit, but not latn-digit-kn, for
some ICU versions, etc, etc... until proven otherwise. Code maintenance for
each new version of ICU might become tedious.
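To illustrate the kind of bookkeeping this would imply, here is a minimal
sketch in C. Everything in it is hypothetical: the struct, the function name,
and especially the version range, which is only a placeholder since the
affected ICU versions have not been established. It assumes a BCP 47 style
locale string and compares the whole extension after "-u-", so that latn-digit
is rejected while latn-digit-kn is not.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical denylist entry, not an existing PostgreSQL structure. */
typedef struct
{
    const char *extension;  /* text following "-u-" in the locale string */
    int         min_major;  /* first affected ICU major version (placeholder) */
    int         max_major;  /* last affected ICU major version (placeholder) */
} KnownBuggyCustomization;

static const KnownBuggyCustomization known_buggy[] = {
    /* Placeholder range: the affected versions are an open question. */
    {"kr-latn-digit", 0, 9999},
};

/*
 * Return true if the locale's extension exactly matches a denylisted
 * customization for the given ICU major version.  The match is exact,
 * so "kr-latn-digit-kn" does not match the "kr-latn-digit" entry.
 */
static bool
collation_is_known_buggy(const char *locale, int icu_major)
{
    const char *ext = strstr(locale, "-u-");
    size_t      i;

    if (ext == NULL)
        return false;       /* no customization at all */
    ext += 3;               /* skip over "-u-" */

    for (i = 0; i < sizeof(known_buggy) / sizeof(known_buggy[0]); i++)
    {
        if (strcmp(ext, known_buggy[i].extension) == 0 &&
            icu_major >= known_buggy[i].min_major &&
            icu_major <= known_buggy[i].max_major)
            return true;
    }
    return false;
}
```

Each newly discovered buggy customization, and each new ICU release, would
mean revisiting such a table, which is the maintenance cost I am worried about.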
But maybe I am being silly planning for unknowns, and ICU is only
affected for latn-digit?
I really have no strong feeling right now about the best solution to adopt.
However, I feel the least we should do is document it somewhere, with
strong emphasis.
Regards,