From: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
---|---|
To: | mnelson(at)binarykeep(dot)com |
Cc: | PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Text search prefix matching and stop words |
Date: | 2021-10-08 20:30:41 |
Message-ID: | CALT9ZEG-i0prBw5N7pMAPqL_Kj=g_xK-oKjumE6-q0TVvOfB4A@mail.gmail.com |
Lists: | pgsql-bugs |
>
> Prefix matching should not omit stop words, as matching lexemes may
> legitimately begin with stop words.
>
> # select to_tsquery('english', 'over:*') @@ to_tsvector('english',
> 'overhaul');
> NOTICE: text-search query contains only stop words or doesn't contain
> lexemes, ignored
> ?column?
> ----------
> f
> (1 row)
>
> I noticed this after implementing interactive, incremental search in an
> application. As the user typed "overhaul," with each successive character
> executing a search, "ove" and "overh" matched a particular document, but
> "over" did not.
Many thanks for the report!
I am not sure this is a bug. I think this is simply how the to_tsquery
conversion works: stop words are removed first, then the prefix template (:*) is processed.
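As a small illustration of that ordering (a sketch only, assuming the stock 'english' configuration, in which 'over' is a stop word as the report above shows):

```sql
-- 'over' is on the English stop-word list, so it is dropped before the
-- :* prefix marker is considered; this raises the NOTICE quoted above.
SELECT to_tsquery('english', 'over:*');

-- 'overh' is not a stop word, so the lexeme is kept together with
-- its prefix marker.
SELECT to_tsquery('english', 'overh:*');
```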
If you want to search after each character the user types, you can cast the
input directly to the tsquery type while the input is still unfinished:
'over:*'::tsquery;
and once the user finishes typing, process the result via to_tsquery,
which applies stop words.
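A minimal sketch of that two-phase approach, assuming the stock 'english' configuration and the sample document from the report. Note that the cast does no normalization (no case folding, no stemming), and the text must already be valid tsquery syntax, so the application would have to lowercase and sanitize the input itself:

```sql
-- Phase 1: while the user is still typing, bypass the text search
-- configuration so that no stop-word filtering or stemming is applied.
SELECT 'over:*'::tsquery @@ to_tsvector('english', 'overhaul');

-- Phase 2: once input is complete, normalize it through to_tsquery,
-- which applies the configuration's stop-word list and stemmer.
SELECT to_tsquery('english', 'overhaul') @@ to_tsvector('english', 'overhaul');
```

With this sample document the first query is expected to match, since 'over' is a prefix of the lexeme stored for 'overhaul'.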
If to_tsquery worked the way you describe, I expect it would never apply the
stop-word filter to templated (prefix) input, since a prefix cannot be
meaningfully compared to stop words.
--
Best regards,
Pavel Borisov
Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>