From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Martin L(dot) Buchanan" <martinlbuchanan(at)gmail(dot)com>, pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Boyer-Moore string searching in LIKE, ILIKE, and/or POSITION? |
Date: | 2023-02-09 02:06:34 |
Message-ID: | CAApHDvruPCe3Sz5M=y=zzgT71Xj8chwD-MvXTih9b934V8Wf1w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, 9 Feb 2023 at 14:49, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > Tom's argument seems to think it's impossible, so if you find that
> > it's definitely not impossible, then you can assume he's wrong about
> > that.
>
> My point was that it seems like you'd need a separate BMH engine for
> each %-separated segment of the LIKE pattern. I'm not quite clear on
> whether BMH can handle '_' (single-char wildcard) conveniently by
> itself, although my gut feel is that you can probably make that part
> work. Maybe you can even extend the idea to embedded %, but that
> seems more difficult.
Yeah, I think to make it work with more complex patterns like
'%some%string%' or '%some_string%' you'd need to break that into
multiple independent searches for each portion between a wildcard
character. For the former pattern, you'd need to do some final check
that ensures that the 2nd pattern was found in some position >= the
position of the 1st pattern + its length. For the latter, the final
check would need to validate that the 2nd pattern was found at a
position of the first pattern + its length + 1. It's probably also
possible to make those patterns work when they don't contain the
leading and trailing % by checking that the first pattern is found at
position 0 and the end of the 2nd pattern is found at the end of the
search string.
However, I imagine going to the trouble of trying to make it work for
more complex patterns initially would be a bad idea. I imagine there
are just too many cases where we could demonstrate performance
regressions and that would cause us to reject the patch.
David
From | Date | Subject | |
---|---|---|---|
Next Message | Vladimir Sitnikov | 2023-02-09 05:41:46 | Re: Boyer-Moore string searching in LIKE, ILIKE, and/or POSITION? |
Previous Message | Tom Lane | 2023-02-09 01:49:43 | Re: Boyer-Moore string searching in LIKE, ILIKE, and/or POSITION? |