Re: Mailing list search engine: surprising missing results?

From: James Addison <jay(at)jp-hosting(dot)net>
To: Ivan Panchenko <i(dot)panchenko(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Mailing list search engine: surprising missing results?
Date: 2022-01-26 08:28:43
Message-ID: CALDQ5NwjHE6jjmxVPSq00FbTiVVKcb9+fX7nMnrRXtHNZGt+2g@mail.gmail.com
Lists: pgsql-www

On Tue, 25 Jan 2022 at 21:23, Ivan Panchenko <i(dot)panchenko(at)postgrespro(dot)ru> wrote:
>
> On 25.01.2022 23:48, James Addison wrote:
> > I'm uncertain why parsing hyphenated query text produces compound tokens?
>
> Because in some cases user wants to search the full hyphenated words,
> not parts of them.

That makes sense, although to refer back to a previous suggestion of
yours, we could allow matching on the full hyphenated words by
emitting an 'OR' condition from the parsed query, instead of 'AND'
(perhaps using an argument?).

In other words:

# expected query to achieve a match (from your previous post in this thread)
'boyers-moore' | ('boyers' & 'moore')

# actual query that does not result in a match today
# (plainto_tsquery for 'boyer-moore')
'boyer-moore' & 'boyer' & 'moore'
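
To make the difference concrete, here is a minimal Python sketch of the two boolean forms evaluated against a set of lexemes. It does not call PostgreSQL; the lexeme set is a hypothetical example, assuming a document that contains "Boyer-Moore-Horspool" and a parser that indexes the full compound plus each part (so the shorter compound 'boyer-moore' is never indexed). The function names are made up for illustration.

```python
# Hypothetical lexeme set for a document containing "Boyer-Moore-Horspool".
# Assumption: the parser emits the full compound plus each part, so the
# shorter compound 'boyer-moore' is not among the indexed lexemes.
doc_lexemes = {"boyer-moore-horspool", "boyer", "moore", "horspool"}


def and_form(lexemes, compound, parts):
    """Evaluate 'compound' & 'part1' & 'part2' ... (the form emitted today)."""
    return compound in lexemes and all(p in lexemes for p in parts)


def or_form(lexemes, compound, parts):
    """Evaluate 'compound' | ('part1' & 'part2' ...) (the suggested form)."""
    return compound in lexemes or all(p in lexemes for p in parts)


compound, parts = "boyer-moore", ["boyer", "moore"]
and_result = and_form(doc_lexemes, compound, parts)
or_result = or_form(doc_lexemes, compound, parts)
print("AND form matches:", and_result)
print("OR form matches:", or_result)
```

Under these assumptions the AND form fails because the compound lexeme is missing, while the OR form still matches through the individual parts. A document that does contain the exact compound would satisfy both forms.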

> >> It seems to me that in both cases we'd be better off generating
> >> "'boyers' <-> 'moore'", without the compound token at all.
> >> Maybe there's a case for the weaker 'boyers' & 'moore' translation,
> >> but I think if people wanted that they'd just enter separate words.
>
> Matching the compound token might be significant for ranking. (?)

Yes, that does seem likely. The knowledge that there is an exact-match
token in the results could be important for various use cases
(including relevance scoring).

> Probably, there is no universal *to_tsquery function and no universal
> parser to fit all users.

That seems possible too, yep.
