From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Ivan Panchenko <i(dot)panchenko(at)postgrespro(dot)ru> |
Cc: | pgsql-www(at)lists(dot)postgresql(dot)org |
Subject: | Re: Mailing list search engine: surprising missing results? |
Date: | 2022-01-25 17:54:28 |
Message-ID: | 2274255.1643133268@sss.pgh.pa.us |
Lists: | pgsql-www |
Ivan Panchenko <i(dot)panchenko(at)postgrespro(dot)ru> writes:
> The actual explanation can be seen from comparing a tsvector with a tsquery.
> To avoid stemming effects, we use the simple configuration below.
> # select plainto_tsquery('simple','boyers-moore');
> plainto_tsquery
> -------------------------------------
> 'boyers-moore' & 'boyers' & 'moore'
> # select to_tsvector('simple','boyers-moore-horspool');
> to_tsvector
> -------------------------------------------------------------
> 'boyers':2 'boyers-moore-horspool':1 'horspool':4 'moore':3
> Obviously, such a tsvector does not match the above tsquery. I think a better tsquery for this query would be
> 'boyers-moore' | ('boyers' & 'moore')
> Maybe it is worth changing the to_tsquery() behavior for such cases.
Changing the behavior of to_tsquery is certainly a lot less scary
than changing to_tsvector --- it wouldn't call the validity of
existing tsvector indexes into question.
I see that to_tsquery is even sillier than plainto_tsquery:
regression=# select to_tsquery('simple','boyers-moore');
to_tsquery
-----------------------------------------
'boyers-moore' <-> 'boyers' <-> 'moore'
(1 row)
which is absolutely not a sane translation.
It seems to me that in both cases we'd be better off generating
"'boyers' <-> 'moore'", without the compound token at all.
Maybe there's a case for the weaker 'boyers' & 'moore' translation,
but I think if people wanted that they'd just enter separate words.
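As a rough check of the candidate translations discussed above, the sketch below casts literal values to tsvector and tsquery, so the outcome does not depend on how a given server version's parser splits hyphenated words. The tsvector literal is copied from the quoted to_tsvector output; the column aliases are only illustrative labels, and the <-> operator assumes PostgreSQL 9.6 or later.

```sql
-- Compare candidate tsquery forms against the tsvector shown for
-- 'boyers-moore-horspool'. Dollar-quoted literals are cast directly,
-- so no text search configuration or parser is involved here.
WITH doc(v) AS (
  SELECT $$'boyers':2 'boyers-moore-horspool':1 'horspool':4 'moore':3$$::tsvector
)
SELECT
  -- what plainto_tsquery produced in the example above
  v @@ $$'boyers-moore' & 'boyers' & 'moore'$$::tsquery     AS plainto_form,
  -- what to_tsquery produced in the example above
  v @@ $$'boyers-moore' <-> 'boyers' <-> 'moore'$$::tsquery AS to_tsquery_form,
  -- the OR form suggested in the quoted message
  v @@ $$'boyers-moore' | ('boyers' & 'moore')$$::tsquery   AS or_form,
  -- the phrase form without the compound token
  v @@ $$'boyers' <-> 'moore'$$::tsquery                    AS phrase_form,
  -- the weaker AND form without the compound token
  v @@ $$'boyers' & 'moore'$$::tsquery                      AS and_form
FROM doc;
```

The first two columns should come out false, because the lexeme 'boyers-moore' does not occur in the tsvector. The last three should come out true: 'boyers' and 'moore' are both present, and they sit at adjacent positions 2 and 3, which is what the phrase operator requires.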
regards, tom lane