From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
---|---|
To: | James Addison <jay(at)jp-hosting(dot)net>, pgsql-www(at)postgresql(dot)org |
Subject: | Re: Mailing list search engine: surprising missing results? |
Date: | 2022-01-24 07:27:41 |
Message-ID: | ab4184b7ab84623be10c4676e090cc27ae78b355.camel@cybertec.at |
Lists: | pgsql-www |
On Sun, 2022-01-23 at 12:49 +0000, James Addison wrote:
> Hello,
>
> I noticed that the mailing list search engine[1] seems to unexpectedly
> miss results for some queries.
>
> For example:
>
> A search for "boyer"[2] returns five results, including result
> snippets that contain the text "Boyer-More-Horspool" [sic] and
> "Boyer-Moore-Horspool".
>
> However, a more specific search for "boyer-moore"[3] does not return
> any results -- that seems surprising.
>
> Specializing the query further and searching for
> "boyer-moore-horspool"[4] *does* again return results -- two documents
> -- with the terms "boyer" and "horspool" highlighted.

This is caused by the peculiarities of PostgreSQL full text search:

    SELECT to_tsvector('english', 'Boyer-Moore-Horspool')
           @@ websearch_to_tsquery('english', 'boyer-moore');

     ?column?
    ══════════
     f
    (1 row)

The reason is that the compound token 'boyer-moore' is stemmed to
'boyer-moor', since 'moore' is at the end of the word, while the
compound token 'boyer-moore-horspool' has 'moore' in the middle and
keeps it unstemmed:

    SELECT to_tsvector('english', 'Boyer-Moore-Horspool');

                           to_tsvector
    ══════════════════════════════════════════════════════════
     'boyer':2 'boyer-moore-horspool':1 'horspool':4 'moor':3
    (1 row)

    SELECT websearch_to_tsquery('english', 'boyer-moore');

            websearch_to_tsquery
    ═════════════════════════════════════
     'boyer-moor' <-> 'boyer' <-> 'moor'
    (1 row)

The lexeme 'boyer-moor' is not present in the first result, so the
phrase query cannot match.
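
To see how a hyphenated word is split into a compound token and its
parts, one could inspect it with ts_debug. This is only a minimal
sketch, assuming the default 'english' text search configuration:

```sql
-- List each token the parser produces for the hyphenated search term,
-- with its token type (alias) and the lexemes the dictionaries emit.
SELECT alias, token, lexemes
FROM ts_debug('english', 'boyer-moore');
```

The whole hyphenated word and each of its parts are reported as
separate rows, which is where the three lexemes in the query above
come from.
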
As a workaround, I suggest that you search for 'boyer moore'
or (even better) '"boyer moore"' (with the double quotes):

    SELECT websearch_to_tsquery('english', 'boyer moore');

     websearch_to_tsquery
    ══════════════════════
     'boyer' & 'moor'
    (1 row)

    SELECT websearch_to_tsquery('english', '"boyer moore"');

     websearch_to_tsquery
    ══════════════════════
     'boyer' <-> 'moor'
    (1 row)

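
As a quick check, both workaround queries can be matched against the
same document text. This is a sketch under the same assumption (the
default 'english' configuration); the column names and_match and
phrase_match are only illustrative:

```sql
-- Match the document against both workaround queries.
-- 'boyer' (position 2) and 'moor' (position 3) are adjacent in the
-- tsvector, so the AND query and the phrase query should both match.
SELECT to_tsvector('english', 'Boyer-Moore-Horspool')
       @@ websearch_to_tsquery('english', 'boyer moore')     AS and_match,
       to_tsvector('english', 'Boyer-Moore-Horspool')
       @@ websearch_to_tsquery('english', '"boyer moore"')   AS phrase_match;
```

Neither query contains the compound lexeme 'boyer-moor', so it no
longer matters that the document lacks it.
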
Yours,
Laurenz Albe