From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
---|---|
To: | Euler Taveira <euler(at)eulerto(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Support POSITION with nondeterministic collations |
Date: | 2025-02-21 11:48:21 |
Message-ID: | 6107daa2-5cf7-4cf2-a526-626be1d15b18@eisentraut.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12.02.25 05:31, Euler Taveira wrote:
> On Mon, Dec 2, 2024, at 6:09 AM, Peter Eisentraut wrote:
>> On 26.08.24 08:09, Peter Eisentraut wrote:
>> > This patch allows using text position search functions with
>> > nondeterministic collations. These functions are
>> >
>> > - position, strpos
>> > - replace
>> > - split_part
>> > - string_to_array
>> > - string_to_table
>> >
>> > which all use common internal infrastructure.
>>
>> > Some exploratory testing could be useful here. The present test
>> > coverage was already quite helpful during development, but there is
>> > always the possibility that something was overlooked.
>
> I took a look at this patch.
>
> * Most callers will require "greedy" semantics, meaning that
> we need
> * to find the longest such substring, not the shortest. For
> callers
> * don't don't need greedy semantics, we can finish on the first
>
> s/don't don't/that don't/ ?
fixed
> - Assert(len1 > 0);
> Assert(len2 > 0);
>
> Is there a reason to remove this assert?
len1 is the length of the "haystack". The previous code did not call
text_position_setup() with an empty haystack, due to early exits at
/* Empty needle always matches at position 1 */
and
/* Otherwise, can't match if haystack is shorter than needle */
but the second one does not apply for nondeterministic collations, so we
have to deal with with zero-length haystacks.
> * (With nondeterministic collations, the search was already
> * multibyte-aware, so we don't need this.)
>
> s/was/is/
fixed
> The commit title could be changed to reflect that you are adding support for
> multiple functions. The POSITION gives the impression that it is only
> for the
> position() function. Something like
>
> Support position search functions with nondeterministic collations
fixed
> I did a couple of tests (some are shown below) and I didn't find issues.
Thanks for the extra tests. I have committed this with the adjustments
mentioned above.
From | Date | Subject | |
---|---|---|---|
Next Message | Rucha Kulkarni | 2025-02-21 11:50:15 | Doubts regarding pg_freespacemap extension |
Previous Message | Michael Paquier | 2025-02-21 11:46:03 | Re: Missing [NO] INDENT flag in XMLSerialize backward parsing |