From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: simplify regular expression locale global variables |
Date: | 2024-10-15 15:04:56 |
Message-ID: | 3965498.1729004696@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Peter Eisentraut <peter(at)eisentraut(dot)org> writes:
> but after the recent improvements to pg_locale_t handling, we don't need
> all three anymore. All the information we have is contained in
> pg_locale_t, so we just need to keep that one. This allows us to
> structure the locale-using regular expression code more similar to other
> locale-using code, mainly by provider, avoiding another layer that is
> specific only to the regular expression code. The first patch
> implements that.
I didn't read that patch in detail; somebody who's more familiar than
I with the recent locale-code changes ought to read it and confirm
that no subtle behavioral changes are sneaking in. But +1 for the
concept.
> The second patch removes a call to pg_set_regex_collation() that I think
> is unnecessary.
I think this is actively wrong. pg_regprefix is engaged in
determining whether there's a fixed prefix of the regex, which
at least involves a sort of symbolic execution. As an example,
whether '^x' has a fixed prefix surely depends on whether the locale
is case-insensitive. (It may be that we get such cases wrong today,
since pg_regprefix was written before we had ICU locales and I don't
know if anyone has revisited it with this in mind. But removing this
pg_set_regex_collation call is surely not going to make that better.
In any case, the gain of removing it must be microscopic.)
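
To make the '^x' example concrete, here is a toy sketch in Python. It is not pg_regprefix and does not reflect how the PostgreSQL regex code works internally; the function name fixed_prefix, the case_insensitive flag, and the simplified metacharacter handling are all assumptions made for illustration. It only shows why the answer to "is there a fixed prefix?" changes once the locale is allowed to fold case.

```python
# Toy illustration only: NOT PostgreSQL's pg_regprefix().
# All names here are hypothetical.

METACHARS = set(".[]()*+?{}|\\$^")


def fixed_prefix(pattern: str, case_insensitive: bool) -> str:
    """Return a literal prefix that every match of `pattern` must start with.

    Returning a shorter prefix than the true one is always safe, so the
    function gives up early whenever it is unsure.
    """
    if not pattern.startswith("^"):
        return ""  # unanchored: a match may start anywhere in the string
    if "|" in pattern:
        return ""  # alternation: be conservative and claim no prefix
    body = pattern[1:]
    prefix = []
    for i, ch in enumerate(body):
        if ch in METACHARS:
            break  # first non-literal construct ends the prefix
        if i + 1 < len(body) and body[i + 1] in "*?{":
            break  # a following quantifier may make this character optional
        if case_insensitive and ch.lower() != ch.upper():
            break  # a cased letter could match either case, so it is not fixed
        prefix.append(ch)
    return "".join(prefix)


if __name__ == "__main__":
    print(repr(fixed_prefix("^x", case_insensitive=False)))
    print(repr(fixed_prefix("^x", case_insensitive=True)))
```

With case-sensitive matching the sketch reports 'x' as the prefix of '^x'; with case-insensitive matching it reports no prefix at all, because both "x..." and "X..." could match. That is the sense in which the prefix computation depends on the collation being set.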
> (I don't have any plans to get rid of the remaining global variable.
> That would certainly be nice from an intellectual point of view, but
> fiddling this into the regular expression code looks quite messy. In
> any case, it's probably easier with one variable instead of three, if
> someone wants to try.)
Yeah. Those global variables are my fault. I did try hard to avoid
having them, but came to the same conclusion that it was not worth
contorting the regex code to pass a locale pointer through it.
Maybe if we ever completely give up on maintaining code similarity
with the Tcl version, we should just bull ahead and do that; but for
now I don't want to.
regards, tom lane