| From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
|---|---|
| To: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Rework of collation code, extensibility |
| Date: | 2023-02-23 23:59:04 |
| Message-ID: | 858d8908e1123ffafb8176ece9aadc3888f5f2c2.camel@j-davis.com |
| Lists: | pgsql-hackers |
On Wed, 2023-02-22 at 20:49 +0100, Peter Eisentraut wrote:
> On 14.02.23 00:45, Jeff Davis wrote:
> > Now the patches are:
> >
> > 0001: pg_strcoll/pg_strxfrm
> > 0002: pg_locale_deterministic()
> > 0003: cleanup a USE_ICU special case
> > 0004: GUCs (only for testing, not for commit)
>
> I haven't read the whole thing again, but this arrangement looks good to
> me. I don't have an opinion on whether 0004 is actually useful.

Committed with a few revisions after I took a fresh look over the
patch.

The most significant was that I found out that we are also hashing the
NUL byte at the end of the string when the collation is
non-deterministic. The refactoring patch doesn't change that, of course,
but the pg_strnxfrm() API is clearer and I added comments.

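
To illustrate the NUL-byte point, here is a minimal sketch, not the committed code. It assumes the standard C strxfrm() as a stand-in for pg_strnxfrm() and a 32-bit FNV-1a hash as a stand-in for PostgreSQL's own hash function; hash_sort_key() and its include_nul flag are hypothetical names used only for this example. It shows that hashing keylen + 1 bytes of the sort key (including the terminator) gives a different value than hashing keylen bytes, which is why the existing behavior has to be preserved rather than silently "fixed".

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in hash (32-bit FNV-1a); an assumption, not PostgreSQL's hash. */
static uint32_t
fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Hash the sort key of a NUL-terminated string. strxfrm() stands in for
 * pg_strnxfrm() here. If include_nul is nonzero, the terminating NUL of
 * the sort key is hashed as well.
 */
uint32_t
hash_sort_key(const char *s, int include_nul)
{
    /* First call: ask how many bytes the sort key needs (without the NUL). */
    size_t   keylen = strxfrm(NULL, s, 0);
    char    *buf = malloc(keylen + 1);
    uint32_t h;

    if (buf == NULL)
        abort();

    /* Second call: produce the sort key, including its terminating NUL. */
    strxfrm(buf, s, keylen + 1);

    h = fnv1a((const unsigned char *) buf,
              include_nul ? keylen + 1 : keylen);
    free(buf);
    return h;
}
```

Calling hash_sort_key("abc", 1) and hash_sort_key("abc", 0) yields two different values, so any hash computed with the extra byte would no longer match if the byte were dropped.
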
Also, ICU uses int32_t for string lengths rather than size_t (I'm not
sure that's a great idea, but that's what ICU does). I clarified the
boundary by changing the argument types of the ICU-specific static
functions to int32_t, while leaving the API entry points as size_t.
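
A rough sketch of that boundary, under stated assumptions: compare_entry() and compare_icu_style() are hypothetical names, memcmp() stands in for the real ICU comparison, and the explicit range check is my own illustration of where the narrowing could be guarded, not a description of what the committed code does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical provider-specific helper: lengths are int32_t, matching the
 * convention of ICU's C API. memcmp() is only a stand-in for the real
 * collation-aware comparison.
 */
static int
compare_icu_style(const char *a, int32_t alen, const char *b, int32_t blen)
{
    int32_t minlen = (alen < blen) ? alen : blen;
    int     cmp = memcmp(a, b, (size_t) minlen);

    if (cmp != 0)
        return (cmp < 0) ? -1 : 1;
    return (alen > blen) - (alen < blen);
}

/*
 * Hypothetical entry point: callers pass size_t. The narrowing to int32_t
 * happens in exactly one place. Returns false if a length does not fit
 * (this check is an assumption made for the example).
 */
bool
compare_entry(const char *a, size_t alen, const char *b, size_t blen,
              int *result)
{
    if (alen > INT32_MAX || blen > INT32_MAX)
        return false;

    *result = compare_icu_style(a, (int32_t) alen, b, (int32_t) blen);
    return true;
}
```

Keeping size_t at the entry points means callers never deal with ICU's length type, and the conversion is confined to one visible spot.
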
--
Jeff Davis
PostgreSQL Contributor Team - AWS