From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
---|---|
To: | jian he <jian(dot)universality(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com> |
Subject: | Re: Support LIKE with nondeterministic collations |
Date: | 2024-11-15 15:42:31 |
Message-ID: | 07fdbb85-c530-48d0-adc0-7d43d7951e1b@eisentraut.org |
Lists: | pgsql-hackers |
On 15.11.24 05:26, jian he wrote:
> /*
> * Now build a substring of the text and try to match it against
> * the subpattern. t is the start of the text, t1 is one past the
> * last byte. We start with a zero-length string.
> */
> t1 = t;
> t1len = tlen;
> for (;;)
> {
> int cmp;
> CHECK_FOR_INTERRUPTS();
> cmp = pg_strncoll(subpat, subpatlen, t, (t1 - t), locale);
>
> select '.foo.' LIKE '_oo' COLLATE ign_punct;
> pg_strncoll is then called with these values for its first 4 arguments:
> oo 2 foo. 0
> oo 2 foo. 1
> oo 2 foo. 2
> oo 2 foo. 3
> oo 2 foo. 4
>
> It seems there is a possible shortcut/optimization:
> if subpat doesn't contain a wildcard (percent sign, underscore),
> can we make fewer pg_strncoll calls?
How would you do that? You need to try all combinations to find one
that matches.
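To illustrate why every prefix length has to be tried: under a nondeterministic collation, strings of different byte lengths can compare equal, so the length of the matching text substring cannot be derived from the subpattern. The following is a minimal sketch in C, not the patch's code. `toy_strncoll` is a hypothetical stand-in for pg_strncoll that merely skips ASCII punctuation, and `toy_match_prefix` is an invented helper that mirrors the loop quoted above.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_strncoll(): compares two byte strings
 * while ignoring ASCII punctuation. Returns <0, 0, or >0.
 */
static int
toy_strncoll(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t      i = 0;
    size_t      j = 0;

    for (;;)
    {
        /* skip characters that the toy collation treats as ignorable */
        while (i < alen && ispunct((unsigned char) a[i]))
            i++;
        while (j < blen && ispunct((unsigned char) b[j]))
            j++;
        if (i == alen || j == blen)
            break;
        if (a[i] != b[j])
            return ((unsigned char) a[i] < (unsigned char) b[j]) ? -1 : 1;
        i++;
        j++;
    }
    if (i == alen && j == blen)
        return 0;
    return (i == alen) ? -1 : 1;
}

/*
 * Invented helper: try every prefix of t, from length 0 up to tlen,
 * against a wildcard-free subpattern. Returns the length of the first
 * matching prefix, or -1 if none matches. *ncalls counts comparisons.
 */
static int
toy_match_prefix(const char *subpat, size_t subpatlen,
                 const char *t, size_t tlen, int *ncalls)
{
    for (size_t len = 0; len <= tlen; len++)
    {
        (*ncalls)++;
        if (toy_strncoll(subpat, subpatlen, t, len) == 0)
            return (int) len;
    }
    return -1;
}
```

With this toy comparator, toy_match_prefix("oo", 2, "foo.", 4, &n) tries lengths 0 through 4 (five calls, as in the trace above) and finds no match, while the text "o.o!" matches at prefix length 3, one byte longer than the subpattern.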
> Minimal case to trigger the error within GenericMatchText,
> since there are no related tests:
> create table t1(a text collate case_insensitive, b text collate "C");
> insert into t1 values ('a','a');
> select a like b from t1;
This results in
ERROR: 42P22: could not determine which collation to use for LIKE
HINT: Use the COLLATE clause to set the collation explicitly.
which is the expected behavior.
> In section 9.7.1 (LIKE), we still don't know what a "wildcard" is;
> the term is only mentioned in 9.7.2.
> Maybe we can add a sentence at the end of:
> <para>
> If <replaceable>pattern</replaceable> does not contain percent
> signs or underscores, then the pattern only represents the string
> itself; in that case <function>LIKE</function> acts like the
> equals operator. An underscore (<literal>_</literal>) in
> <replaceable>pattern</replaceable> stands for (matches) any single
> character; a percent sign (<literal>%</literal>) matches any sequence
> of zero or more characters.
> </para>
>
> saying that underscore and percent sign are wildcards in LIKE.
> Other than that, I can understand the doc.
Ok, I agree that could be clarified.