From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
Cc: | mnelson(at)binarykeep(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Text search prefix matching and stop words |
Date: | 2021-10-08 21:06:27 |
Message-ID: | 8755.1633727187@sss.pgh.pa.us |
Lists: | pgsql-bugs |

Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> writes:
>> Prefix matching should not omit stop words, as matching lexemes may
>> legitimately begin with stop words.
> I am not sure that it is a bug. I think this is just how the to_tsquery
> conversion works: stop words first, then template processing.

I concur with the OP that this is a bug, or at least that it'd be nice
if it worked better. But I'm not sure we can make it better. The basic
design of our text search stuff combined the functions of normalization
and stop-word-suppression into a single dictionary stack, so that it's
impossible to ask for just one of those to happen. But if we skip
applying the dictionaries at all for a prefix item, then word
normalization doesn't happen, which would create a different set of
unexpected-failure-to-match conditions. (So your proposed workaround
of casting directly to tsquery just moves the problem somewhere else.)

I think we could only fix this with a dictionary API change that
allows telling the dictionaries not to suppress stopwords. Not
sure how practical that is. If we'd had the prefix-match feature
from the beginning, maybe it'd have occurred to us that we needed
that API option ... but we didn't.

regards, tom lane
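
The trade-off described in the message can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL's dictionary API: `toy_lexize`, `prefix_matches`, the stop-word list, the crude suffix stripping, and the `keep_stop_words` flag are all hypothetical names invented for illustration. It assumes a single dictionary step that both normalizes a token and drops stop words, and compares three ways of handling a prefix item.

```python
# Toy model only: none of these names exist in PostgreSQL.
STOP_WORDS = {"the", "a", "an", "of"}


def toy_lexize(token, keep_stop_words=False):
    """Normalize a token and, by default, drop it if it is a stop word.

    Both jobs happen in one call, mirroring the combined design the
    message describes. keep_stop_words is a hypothetical option.
    """
    lexeme = token.lower()
    if lexeme in STOP_WORDS and not keep_stop_words:
        return None  # stop word suppressed
    # Crude stand-in for stemming: strip one common suffix.
    for suffix in ("ing", "es", "s"):
        if lexeme.endswith(suffix) and len(lexeme) > len(suffix) + 2:
            lexeme = lexeme[: -len(suffix)]
            break
    return lexeme


def prefix_matches(prefix, document_lexemes):
    """Return the sorted document lexemes that start with the prefix."""
    if prefix is None:
        return []
    return sorted(lex for lex in document_lexemes if lex.startswith(prefix))


# Index a small "document" through the dictionary.
doc_tokens = ["Theory", "of", "the", "Theatres"]
doc_lexemes = {toy_lexize(t) for t in doc_tokens} - {None}

# 1. Dictionary applied to the prefix item: "The" is dropped as a stop word.
with_dict = prefix_matches(toy_lexize("The"), doc_lexemes)

# 2. Dictionary skipped for the prefix item: no normalization, so the
#    capitalized prefix fails to match the lowercased lexemes.
without_dict = prefix_matches("The", doc_lexemes)

# 3. Hypothetical option: normalize but do not suppress stop words.
with_flag = prefix_matches(toy_lexize("The", keep_stop_words=True), doc_lexemes)
```

In this toy model the first two approaches both match nothing, for different reasons, while the third matches both indexed lexemes. That is the gap a "do not suppress stop words" option would be meant to close; whether such an option is practical in the real dictionary API is left open in the message.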