Re: Refactor: allow pg_strncoll(), etc., to accept -1 length for NUL-terminated cstrings.

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Refactor: allow pg_strncoll(), etc., to accept -1 length for NUL-terminated cstrings.
Date: 2024-09-21 00:28:51
Message-ID: 628ce15fc833a3da075a253b11b71f409675f3f9.camel@j-davis.com
Lists: pgsql-hackers

On Thu, 2024-08-22 at 11:00 -0700, Jeff Davis wrote:
> Like ICU, allow -1 length to mean that the input string is NUL-
> terminated for pg_strncoll(), pg_strnxfrm(), and
> pg_strnxfrm_prefix().
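
As a minimal sketch of that length convention (not code from the patches): the names resolve_len() and toy_strncoll_len() are hypothetical, and the comparison is a plain bytewise one standing in for a real collation routine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: a negative length means "the input is
 * NUL-terminated", so fall back to strlen().
 */
static size_t
resolve_len(const char *s, ptrdiff_t len)
{
    return (len < 0) ? strlen(s) : (size_t) len;
}

/*
 * Toy stand-in for a length-taking comparison function.  It compares
 * bytes only; a real provider would apply collation rules here.
 * Returns -1, 0, or 1.
 */
static int
toy_strncoll_len(const char *a, ptrdiff_t alen,
                 const char *b, ptrdiff_t blen)
{
    size_t      la = resolve_len(a, alen);
    size_t      lb = resolve_len(b, blen);
    size_t      n = (la < lb) ? la : lb;
    int         r = memcmp(a, b, n);

    if (r != 0)
        return (r > 0) - (r < 0);
    /* Common prefix is equal: the shorter input sorts first. */
    return (la > lb) - (la < lb);
}
```

Callers that already hold a C string pass -1, and callers with a counted buffer pass the byte count, so both kinds of input go through the same entry point.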

To better illustrate the direction I'm going, I put together some rough
patches that implement collation using a table of methods rather than
lots of branching based on the provider.
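
The general shape of a method table can be sketched as follows. This is an illustration only, not the structure used in the patches: every type and function name here (toy_locale, toy_collate_methods, toy_strncoll, and the two toy providers) is an assumption made for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_locale toy_locale;

/* Hypothetical table of collation methods supplied by a provider. */
typedef struct toy_collate_methods
{
    int         (*strncoll) (const char *a, size_t alen,
                             const char *b, size_t blen,
                             const toy_locale *loc);
} toy_collate_methods;

struct toy_locale
{
    const char *provider_name;
    const toy_collate_methods *collate;
};

/* Tie-break on length once the common prefix compares equal. */
static int
cmp_len(size_t alen, size_t blen)
{
    return (alen > blen) - (alen < blen);
}

/* Toy provider 1: bytewise comparison. */
static int
bytewise_strncoll(const char *a, size_t alen, const char *b, size_t blen,
                  const toy_locale *loc)
{
    size_t      n = (alen < blen) ? alen : blen;
    int         r = memcmp(a, b, n);

    (void) loc;
    return (r != 0) ? (r > 0) - (r < 0) : cmp_len(alen, blen);
}

/* Toy provider 2: ASCII case-insensitive comparison. */
static int
nocase_strncoll(const char *a, size_t alen, const char *b, size_t blen,
                const toy_locale *loc)
{
    size_t      n = (alen < blen) ? alen : blen;

    (void) loc;
    for (size_t i = 0; i < n; i++)
    {
        int         ca = tolower((unsigned char) a[i]);
        int         cb = tolower((unsigned char) b[i]);

        if (ca != cb)
            return (ca > cb) - (ca < cb);
    }
    return cmp_len(alen, blen);
}

static const toy_collate_methods bytewise_methods = {bytewise_strncoll};
static const toy_collate_methods nocase_methods = {nocase_strncoll};

static const toy_locale bytewise_locale = {"bytewise", &bytewise_methods};
static const toy_locale nocase_locale = {"nocase", &nocase_methods};

/* The caller dispatches through the table; no branch on the provider. */
static int
toy_strncoll(const char *a, size_t alen, const char *b, size_t blen,
             const toy_locale *loc)
{
    return loc->collate->strncoll(a, alen, b, blen, loc);
}
```

With this layout, adding a provider means filling in another table rather than adding another case to every call site.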

This more cleanly separates the API for a provider, which will enable
us to use a hook to create a custom provider with arbitrary methods
that may have nothing to do with ICU or libc. Or, we could go so far as
to implement a "CREATE LOCALE PROVIDER" that would provide the methods
using a handler function, and "datlocprovider" would be an OID rather
than a char.
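
A rough sketch of the hook idea, under stated assumptions: the hook variable, its signature, and all demo_* names are hypothetical and do not come from the patches. The core asks the hook first and falls back to a built-in locale when no hook is installed or the hook declines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical locale object carrying its own comparison method. */
typedef struct demo_locale
{
    const char *provider;
    int         (*compare) (const char *a, const char *b);
} demo_locale;

/* Hypothetical hook: returns a locale, or NULL to decline. */
typedef const demo_locale *(*demo_create_locale_hook_type) (const char *name);
static demo_create_locale_hook_type demo_create_locale_hook = NULL;

static int
builtin_compare(const char *a, const char *b)
{
    return strcmp(a, b);
}

static const demo_locale builtin_locale = {"builtin", builtin_compare};

/* Core side: give the hook the first chance, else use the built-in. */
static const demo_locale *
demo_create_locale(const char *name)
{
    if (demo_create_locale_hook != NULL)
    {
        const demo_locale *loc = demo_create_locale_hook(name);

        if (loc != NULL)
            return loc;
    }
    return &builtin_locale;
}

/* Extension side: a provider unrelated to ICU or libc. */
static int
reversed_compare(const char *a, const char *b)
{
    return strcmp(b, a);
}

static const demo_locale reversed_locale = {"reversed", reversed_compare};

static const demo_locale *
demo_extension_hook(const char *name)
{
    return (strcmp(name, "reversed") == 0) ? &reversed_locale : NULL;
}

/* What an extension's load-time initialization might do. */
static void
demo_extension_init(void)
{
    demo_create_locale_hook = demo_extension_hook;
}
```

A "CREATE LOCALE PROVIDER" with a handler function would replace the single global hook with a catalog lookup, but the returned object could have the same shape.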

From a practical perspective, I expect that extensions would use this
to lock down the version of a particular provider rather than implement
a completely arbitrary one. But the API is good for either case, and
offers quite a bit of code cleanup.

There are quite a few loose ends, of course:

* There is still a lot of branching on the provider for DDL and
catalog access. I'm not sure if we will ever eliminate all of this, or
if we would even want to.

* I haven't done anything with get_collation_actual_version().
Perhaps that should be a method, too, but it requires some extra
thought if we want this to be useful for "multilib" (having multiple
versions of a provider library at once).

* I didn't add methods for formatting.c yet.

* initdb -- should it offer a way to preload a library and then use
that for the provider?

* I need to allow an arbitrary per-provider context, rather than the
current union designed for the existing providers.
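
For the last item, one possible shape for an arbitrary per-provider context is an opaque pointer owned by the provider, with a destroy callback. This is only a sketch under that assumption; ctx_locale, prefix_ctx, and the functions below are hypothetical names, not anything from the patches.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical locale object: instead of a union with one member per
 * known provider, it carries an opaque pointer that only the provider
 * interprets.
 */
typedef struct ctx_locale
{
    int         (*compare) (const char *a, const char *b, void *ctx);
    void        (*destroy) (void *ctx);
    void       *ctx;
} ctx_locale;

/* Private state of a toy provider: skip the first "skip" bytes. */
typedef struct prefix_ctx
{
    size_t      skip;
} prefix_ctx;

static int
prefix_compare(const char *a, const char *b, void *ctx)
{
    const prefix_ctx *p = ctx;
    size_t      la = strlen(a);
    size_t      lb = strlen(b);

    /* Never step past the terminating NUL of either input. */
    a += (p->skip < la) ? p->skip : la;
    b += (p->skip < lb) ? p->skip : lb;
    return strcmp(a, b);
}

/* Returns 0 on success, -1 if the context could not be allocated. */
static int
ctx_locale_init_prefix(ctx_locale *loc, size_t skip)
{
    prefix_ctx *p = malloc(sizeof(*p));

    if (p == NULL)
        return -1;
    p->skip = skip;
    loc->compare = prefix_compare;
    loc->destroy = free;
    loc->ctx = p;
    return 0;
}

static void
ctx_locale_release(ctx_locale *loc)
{
    if (loc->destroy != NULL)
        loc->destroy(loc->ctx);
    loc->ctx = NULL;
}
```

The core never looks inside ctx, so a new provider can keep whatever state it needs without the core's struct definition changing.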

Again, the patches are rough and there's a lot of code churn. I'd like
some feedback on whether people generally like the direction this is
going. If so, I will clean up the patch series into smaller, more
reviewable chunks.

Regards,
Jeff Davis

Attachment Content-Type Size
v4-0007-Use-method-table-for-collation.patch text/x-patch 97.3 KB
v4-0006-Allow-length-1-for-NUL-terminated-input-to-pg_str.patch text/x-patch 19.5 KB
v4-0005-invalidation.patch text/x-patch 2.7 KB
v4-0004-resource-owners.patch text/x-patch 5.1 KB
v4-0003-CollationCacheContext.patch text/x-patch 2.6 KB
v4-0002-create_pg_locale.patch text/x-patch 12.6 KB
v4-0001-Tighten-up-make_libc_collator-and-make_icu_collat.patch text/x-patch 8.1 KB
