pgsql: Support text position search functions with nondeterministic col

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Support text position search functions with nondeterministic col
Date: 2025-02-21 11:30:24
Message-ID: E1tlREg-000SRA-0A@gemulon.postgresql.org
Lists: pgsql-committers

Support text position search functions with nondeterministic collations

This allows using text position search functions with nondeterministic
collations. These functions are

- position, strpos
- replace
- split_part
- string_to_array
- string_to_table

which all use common internal infrastructure.

Previously there was no internal implementation for this case, so
calling any of these functions with a nondeterministic collation raised
a not-supported error. This adds the internal implementation and
removes the error.

Unlike with deterministic collations, the search cannot use any
byte-by-byte optimized techniques but has to go substring by
substring. We also need to consider that the found match could have a
different length than the needle and that there could be substrings of
different length matching at a position. In most cases, we need to
find the longest such substring (greedy semantics), but this can be
configured by each caller.
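
The substring-by-substring search described above can be illustrated
with a small standalone sketch. This is not the code added to
varlena.c. The names toy_equal and toy_search are hypothetical, and the
"collation" is a toy stand-in that ignores ASCII case and treats '-' as
ignorable, so that a match can be longer than the needle. It only shows
why several substrings of different length can match at one position
and how a greedy flag picks among them.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy stand-in for a nondeterministic collation (an assumption for this
 * sketch): ASCII case is ignored and '-' is ignorable. Real collations
 * are provided by ICU and are far more involved.
 */
bool
toy_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t i = 0, j = 0;

    for (;;)
    {
        /* skip ignorable characters on both sides */
        while (i < alen && a[i] == '-')
            i++;
        while (j < blen && b[j] == '-')
            j++;
        if (i == alen || j == blen)
            return i == alen && j == blen;
        if (tolower((unsigned char) a[i]) != tolower((unsigned char) b[j]))
            return false;
        i++;
        j++;
    }
}

/*
 * Return the offset of the leftmost match of needle in hay, or -1 if
 * there is none. Every candidate substring starting at each offset is
 * compared as a whole, since no byte-wise shortcut is valid here.
 * With greedy = true the longest match at that offset is reported in
 * *match_len, otherwise the shortest.
 */
long
toy_search(const char *hay, const char *needle, bool greedy,
           size_t *match_len)
{
    size_t hlen = strlen(hay);
    size_t nlen = strlen(needle);

    for (size_t start = 0; start < hlen; start++)
    {
        bool found = false;

        for (size_t end = start + 1; end <= hlen; end++)
        {
            if (toy_equal(hay + start, end - start, needle, nlen))
            {
                *match_len = end - start;
                found = true;
                if (!greedy)
                    break;      /* shortest match is enough */
            }
        }
        if (found)
            return (long) start;
    }
    return -1;
}
```

Searching "xxA-B-yy" for "ab" with this sketch finds a match at offset 2
whose length is 4 ("A-B-") in greedy mode and 3 ("A-B") otherwise, even
though the needle itself is only 2 bytes long.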

Reviewed-by: Euler Taveira <euler(at)eulerto(dot)com>
Discussion: https://www.postgresql.org/message-id/flat/582b2613-0900-48ca-8b0d-340c06f4d400(at)eisentraut(dot)org

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/329304c9012b2ac6d906afeb18062f9080dceef9

Modified Files
--------------
src/backend/utils/adt/varlena.c | 104 ++++++++++++++---
src/test/regress/expected/collate.icu.utf8.out | 154 ++++++++++++++++++++-----
src/test/regress/sql/collate.icu.utf8.sql | 36 +++++-
3 files changed, 246 insertions(+), 48 deletions(-)
