From: | Oleg Bartunov <obartunov(at)postgrespro(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, James Addison <jay(at)jp-hosting(dot)net>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org> |
Subject: | Re: Mailing list search engine: surprising missing results? |
Date: | 2022-01-25 11:04:09 |
Message-ID: | CAF4Au4yttKJ1KAP-cO+HMLQ2_66vmx0dLTBUbE4W8Aa64foafg@mail.gmail.com |
Lists: | pgsql-www |
On Mon, Jan 24, 2022 at 11:47 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Mon, Jan 24, 2022 at 08:27:41AM +0100, Laurenz Albe wrote:
> >> The reason is that the 'moore' in 'boyer-moore' is stemmed, since it
> >> is at the end of the word, while the 'moore' in 'Boyer-Moore-Horspool'
> >> isn't:
>
> > Wow, he showed me this problem earlier but I never suspected it was a
> > stemming issue because I never considered proper nouns could be
> > stem-adjusted, but it is obvious they can.
>
> I wonder if we should change that so that components of a compound
> word are consistently stemmed the same way.
>
Something like this:

SELECT to_tsvector('english', 'Boyer-Moore-Horspool');
                       to_tsvector
----------------------------------------------------------
 'boyer':2 'boyer-moore-horspool':1 'boyer-moore':1 'moore-horspool':1
 'horspool':4 'moor':3
(1 row)
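
For comparison, the current behaviour can be inspected with ts_debug(), which lists each token the parser emits and the lexemes the dictionaries produce for it. This is only a minimal sketch, assuming the stock 'english' configuration; I have left the output out because it depends on the server version and on the configuration in use.

```sql
-- Show how the default parser splits the long compound and which
-- parts end up stemmed by the dictionaries of the 'english' config.
SELECT alias, token, lexemes
FROM ts_debug('english', 'Boyer-Moore-Horspool');

-- Same for the shorter compound discussed upthread.
SELECT alias, token, lexemes
FROM ts_debug('english', 'boyer-moore');

-- Check whether a query for the short compound matches the long one.
SELECT to_tsvector('english', 'Boyer-Moore-Horspool')
       @@ to_tsquery('english', 'boyer-moore') AS matches;
```

Comparing the lexemes column of the first two queries should show where the compound parts are treated differently, and the last query shows whether that difference affects matching.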
>
> regards, tom lane
>
>
>
--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company