Re: Full text search - wildcard and a stop word

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Allan Jardine <allan(dot)jardine(at)sprymedia(dot)co(dot)uk>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Full text search - wildcard and a stop word
Date: 2022-02-22 15:56:23
Message-ID: 705272.1645545383@sss.pgh.pa.us
Lists: pgsql-general

Allan Jardine <allan(dot)jardine(at)sprymedia(dot)co(dot)uk> writes:
> => select to_tsquery('all:*');
> NOTICE: text-search query contains only stop words or doesn't contain
> lexemes, ignored
>  to_tsquery
> ------------
>
> (1 row)

> I get why that is happening - the notification basically details it, but
> the wildcard at the end seems to me that it should return `'all':*` in this
> case? Is this by design or could it be considered a bug?

It's a hard problem. If we don't normalize the presented word, we risk
not matching cases that users would expect to match (because the word
is going to be compared to data that probably *was* normalized).

In this particular case, you can skip the normalization by just not
using to_tsquery:

n=# select 'all:*'::tsquery;
tsquery
---------
'all':*
(1 row)

but that might or might not be what you want in general.
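
To illustrate that trade-off, here is a minimal sketch, assuming the built-in 'english' configuration (these queries are illustrative assumptions, not taken from the original report). The cast skips stemming as well as stop-word removal, so a prefix built from an inflected word may not match text that was stemmed when it was indexed:

```sql
-- Assumes the built-in 'english' text search configuration.

-- The cast keeps the stop word, so the prefix still matches lexemes
-- that begin with "all" (here, the stemmed form of "allocate"):
select to_tsvector('english', 'allocate memory') @@ 'all:*'::tsquery;

-- The cast also skips stemming. The document side is stemmed to 'run',
-- which does not begin with the unstemmed prefix 'running':
select to_tsvector('english', 'run') @@ 'running:*'::tsquery;

-- to_tsquery stems the query word first, so the same search matches:
select to_tsvector('english', 'run') @@ to_tsquery('english', 'running:*');
```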

Perhaps the ideal behavior here would be "normalize, but don't throw away
stopwords", but unfortunately our dictionary APIs don't support that.

regards, tom lane
