| From: | Bruce Momjian <bruce(at)momjian(dot)us> |
|---|---|
| To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
| Cc: | James Addison <jay(at)jp-hosting(dot)net>, pgsql-www(at)postgresql(dot)org |
| Subject: | Re: Mailing list search engine: surprising missing results? |
| Date: | 2022-01-24 19:28:00 |
| Message-ID: | Ye79wNIXsyhwwwce@momjian.us |
| Lists: | pgsql-www |
On Mon, Jan 24, 2022 at 08:27:41AM +0100, Laurenz Albe wrote:
> On Sun, 2022-01-23 at 12:49 +0000, James Addison wrote:
> > Specializing the query further and searching for
> > "boyer-moore-horspool"[4] *does* again return results -- two documents
> > -- with the terms "boyer" and "horspool" highlighted.
>
> This is caused by the peculiarities of PostgreSQL full text search:
>
> SELECT to_tsvector('english', 'Boyer-Moore-Horspool')
> @@ websearch_to_tsquery('english', 'boyer-moore');
>
> ?column?
> ══════════
> f
> (1 row)
>
> The reason is that the 'moore' in 'boyer-moore' is stemmed, since it
> is at the end of the word, while the 'moore' in 'Boyer-Moore-Horspool'
> isn't:
Wow, he showed me this problem earlier, but I never suspected it was a
stemming issue because I never considered that proper nouns could be
stem-adjusted, but it is obvious they can.
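
For anyone who wants to see the stemming difference directly, here is a
minimal sketch. These queries are not from the thread, and they assume a
stock 'english' text search configuration. They print the lexemes on each
side of the failed match, plus a token-by-token breakdown:

```sql
-- Lexemes stored for the three-part compound (the document side).
SELECT to_tsvector('english', 'Boyer-Moore-Horspool');

-- Lexemes the two-part search asks for (the query side).
SELECT websearch_to_tsquery('english', 'boyer-moore');

-- How the parser splits the compound and what each dictionary emits.
SELECT alias, token, lexemes
FROM ts_debug('english', 'Boyer-Moore-Horspool');
```

Comparing the first two results should show which lexeme the query
requires that the document does not contain.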
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.