Re: Why do indexes and sorts use the database collation?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Why do indexes and sorts use the database collation?
Date: 2023-11-13 23:38:05
Message-ID: 20231113233805.rtwnwzlyvldo3bls@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-11-14 00:02:13 +0100, Tomas Vondra wrote:
> On 11/13/23 23:12, Andres Freund wrote:
> > On 2023-11-13 22:36:24 +0100, Tomas Vondra wrote:
> >> ISTM it's about how complex the rules implemented by the collation are,
> >> so I agree the cost should be a feature of collations not providers.
> >
> > I'm not sure analysing the complexity in detail is worth it. ISTM there's a
> > few "levels" of costliness:
> >
> > 1) memcmp() suffices
> > 2) can safely use strxfrm() (i.e. ICU), possibly limited to when we sort
> > 3) deterministic collations
> > 4) non-deterministic collations
> >
> > I'm sure there are gradations, particularly within 3), but I'm not sure it's
> > realistic / worthwhile to go to that detail. I think a cost model like the
> > above would provide enough detail to make better decisions than today...
> >
>
> I'm not saying we have to analyze the complexity of the rules. I was
> simply agreeing with you that the "cost" should be associated with
> individual collations, not the providers.

Just to be clear, I didn't intend to contradict you or anything - I was just
outlining my initial thoughts of how we could model the costs.
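
As a rough illustration of the four levels listed above, here is a minimal
sketch in Python. It is not PostgreSQL code: CollationCostLevel,
CollationInfo and cost_level are hypothetical names, and the three boolean
flags are assumed inputs that something else would have to determine per
collation. Checking non-determinism before strxfrm() safety is also an
assumption, not something settled in this thread.

```python
from dataclasses import dataclass
from enum import IntEnum


class CollationCostLevel(IntEnum):
    """Coarse cost levels, cheapest first (hypothetical names)."""
    MEMCMP = 1            # memcmp() suffices
    STRXFRM = 2           # can safely use strxfrm()
    DETERMINISTIC = 3     # deterministic, but needs full collation-aware compare
    NONDETERMINISTIC = 4  # non-deterministic collation


@dataclass(frozen=True)
class CollationInfo:
    """Assumed per-collation properties; not an existing catalog structure."""
    memcmp_suffices: bool
    strxfrm_safe: bool
    deterministic: bool


def cost_level(coll: CollationInfo) -> CollationCostLevel:
    """Map a collation's properties to one of the four coarse levels."""
    if coll.memcmp_suffices:
        return CollationCostLevel.MEMCMP
    # Assumption: a non-deterministic collation is always the most expensive
    # level, regardless of whether strxfrm() would be usable.
    if not coll.deterministic:
        return CollationCostLevel.NONDETERMINISTIC
    if coll.strxfrm_safe:
        return CollationCostLevel.STRXFRM
    return CollationCostLevel.DETERMINISTIC
```

Because the levels are ordered integers, a caller could compare them
directly (e.g. level_a < level_b) when choosing between alternatives. The
point is only that the cost attaches to the individual collation, not to
the provider.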

Greetings,

Andres Freund
