Re: Collation & ctype method table, and extension hooks

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Collation & ctype method table, and extension hooks
Date: 2024-10-25 22:42:36
Message-ID: dfff1ca84dd0e8ee03aa2478412134895b0b9681.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 2024-10-24 at 10:05 +0200, Andreas Karlsson wrote:
> Why is there no pg_locale_builtin.c?

Just that it would be a fairly small file, but I'm fine with doing
that.

> I think adding an assert to create_pg_locale() which enforces valid
> there is always a combination of ctype_is_c and casemap would be
> good,
> similar to the collate field.

Good idea.

> Why are casemap and ctype_methods not the same struct? They seem very
> closely related.

The code impact was in fairly different places, so it seemed like a
nice way to break it out. I could combine them, but it would be a
fairly large patch.

> This commit makes me tempted to handle the ctype_is_c logic for
> character classes also in callbacks and remove the if in functions
> like
> pg_wc_ispunct(). But this si something that would need to be
> benchmarked.

That's a good idea. The reason collate_is_c is important is because
there are quite a few caller-specific optimizations, but that doesn't
seem to be true of ctype_is_c.

> I wonder if the bitmask idea isn't terrible for the branch predictor
> and
> that me may want one function per character class, but this is yet
> again
> something we need to benchmark.

Agreed -- a lot of work has gone into optimizing the regex code, and we
don't want a perf regression there. But I'm also not sure exactly which
kinds of tests I should be running for that.

> Is there a reason we allocate the icu_provider in
> create_pg_locale_icu
> with MemoryContextAllocZero when we intialize everything anyway? And
> similar for other providers.

Allocating and zeroing is a good defense against new optional methods
and fields which can safely default to zero.

> = v6-0011-Introduce-hooks-for-creating-custom-pg_locale_t.patch
>
> Looks good but seems like a quite painful API to use.

How is it painful and can we make it better?

> > * Have a CREATE LOCALE PROVIDER command and make "provider" an Oid
> > rather than a char ('b'/'i'/'c'). The v6 patches brings us close to
> > this point, but I'm not sure if we want to go this far in v18.
>
> Probably necessary but I hate all the DDL commands the way to SQL
> standard is written forces us to add.

There is some precedent for a DDL-like thing without new grammar:
pg_replication_origin_create(). I don't have a strong opinion on
whether to do that or not.

>
Regards,
Jeff Davis

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2024-10-26 00:51:09 Re: New function normal_rand_array function to contrib/tablefunc.
Previous Message Jeff Davis 2024-10-25 22:18:48 Re: Statistics Import and Export