From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com> |
Subject: | Re: Support LIKE with nondeterministic collations |
Date: | 2024-11-20 07:29:22 |
Message-ID: | CACJufxGuBNQzx1LFBpEP01A1SuSndfzMWXHR9vr9bV3A6dB84g@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Nov 19, 2024 at 9:51 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> On 18.11.24 04:30, jian he wrote:
> > we can optimize when trailing (last character) is not wildcards.
> >
> > SELECT 'Ha12foo' LIKE '%foo' COLLATE ignore_accents;
> > within the for loop
> > for (;;)
> > {
> >     int cmp;
> >     CHECK_FOR_INTERRUPTS();
> >     ....
> > }
> >
> > pg_strncoll comparison will become
> > Ha12foo foo
> > a12foo foo
> > 12foo foo
> > 2foo foo
> > foo foo
> >
> > it's safe because in MatchText we have:
> > else if (*p == '%')
> > {
> >     while (tlen > 0)
> >     {
> >         if (GETCHAR(*t, locale) == firstpat || (locale && !locale->deterministic))
> >         {
> >             int matched = MatchText(t, tlen, p, plen, locale);
> >             if (matched != LIKE_FALSE)
> >                 return matched; /* TRUE or ABORT */
> >         }
> >         NextChar(t, tlen);
> >     }
> > }
> >
> > please check attached.
>
> I see, good idea. I implemented it a bit differently. See "Shortcut:
> If this is the end of the pattern ..." in this patch. Please check if
> this is what you had in mind.
Your implementation is far simpler than mine, and I think I understand it.
I tried to optimize the case where the pattern is a prefix match, like `pattern%`,
but it fails on cases like:
SELECT U&'\0061\0308bc' LIKE U&'\00E4bc%' COLLATE ignore_accents;
Basically, for the_string LIKE the_pattern%, the length of the matching part of
the_string and the length of the_pattern can differ, so we cannot just do a
single pg_strncoll call.
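To illustrate why one comparison is not enough, here is a minimal standalone sketch. It is not PostgreSQL code and does not call pg_strncoll: toy_fold and toy_equal are hypothetical stand-ins that only know about U+00E4 and U+0308 in UTF-8, and both prefix_match_* functions are names made up for this example. It steps through the text byte by byte for brevity; real code would advance by whole characters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy accent folding (assumption, NOT a real collation): maps U+00E4
 * (bytes C3 A4) to 'a' and drops the combining diaeresis U+0308 (bytes CC 88).
 * Writes the folded bytes to out and returns the folded length.
 */
static size_t
toy_fold(const char *s, size_t len, char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (i + 1 < len && c == 0xC3 && (unsigned char) s[i + 1] == 0xA4)
        {
            out[n++] = 'a';     /* precomposed a-diaeresis -> 'a' */
            i++;
        }
        else if (i + 1 < len && c == 0xCC && (unsigned char) s[i + 1] == 0x88)
            i++;                /* skip combining diaeresis */
        else
            out[n++] = s[i];
    }
    return n;
}

/* Hypothetical stand-in for "pg_strncoll(...) == 0" under ignore_accents. */
static bool
toy_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    char   fa[64], fb[64];
    size_t na, nb;

    if (alen > sizeof(fa) || blen > sizeof(fb))
        return false;           /* toy limit on input size */
    na = toy_fold(a, alen, fa);
    nb = toy_fold(b, blen, fb);
    return na == nb && memcmp(fa, fb, na) == 0;
}

/* Naive attempt: one comparison of the first plen bytes of the text. */
static bool
prefix_match_single(const char *t, size_t tlen, const char *p, size_t plen)
{
    if (tlen < plen)
        return false;
    return toy_equal(t, plen, p, plen);
}

/* Try every possible text prefix length against the pattern prefix. */
static bool
prefix_match_all_lengths(const char *t, size_t tlen, const char *p, size_t plen)
{
    for (size_t n = 0; n <= tlen; n++)
    {
        if (toy_equal(t, n, p, plen))
            return true;
    }
    return false;
}
```

With the text U&'\0061\0308bc' (5 bytes in UTF-8) and the pattern prefix U&'\00E4bc' (4 bytes), the single comparison looks at only the first 4 bytes of the text and fails, while trying all prefix lengths finds the match at length 5.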
In match_pattern_prefix, maybe change
    if (expr_coll && !get_collation_isdeterministic(expr_coll))
        return NIL;
to
    if (OidIsValid(expr_coll) && !get_collation_isdeterministic(expr_coll))
        return NIL;
Other than that, I didn't find any issues.