From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | sushant354(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: lexemes in prefix search going through dictionary modifications |
Date: | 2011-10-25 16:05:46 |
Message-ID: | 7407A709-87E3-484D-9E7B-CCBAE7187BF9@phlo.org |
Lists: | pgsql-hackers |
On Oct25, 2011, at 17:26 , Sushant Sinha wrote:
> I am currently using the prefix search feature in text search. I find
> that the prefix characters are treated the same as a normal lexeme and
> passed through stemming and stopword dictionaries. This seems like a bug
> to me.
Hm, I don't think so. If they don't pass through stopword dictionaries,
then queries containing stopwords will fail to find any rows - which is
probably not what one would expect.
Here's an example:
Query for records containing the:* and car:*. The @@ operator returns true
because the stopword is removed from both the tsvector and the tsquery
(the 'english' dictionary drops 'these' and 'the' as stopwords and stems 'cars' to
'car'. Both the tsvector and the query end up being just 'car').
postgres=# select to_tsvector('english', 'these cars') @@ to_tsquery('english', 'the:* & car:*');
?column?
----------
t
(1 row)
Here's what happens if stopwords aren't removed from the query
(now the tsvector still ends up being 'car', but the query is 'the:* & car:*'):
postgres=# select to_tsvector('english', 'these cars') @@ to_tsquery('simple', 'the:* & car:*');
?column?
----------
f
(1 row)
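To see why the two results differ, it can help to look at each side of the @@ operator separately. A small sketch, assuming the stock 'english' and 'simple' text search configurations (custom dictionaries may behave differently):

```sql
-- Document side: 'these' is dropped as a stopword, 'cars' is stemmed to 'car'.
select to_tsvector('english', 'these cars');

-- Query side with 'english': the stopword prefix 'the:*' is dropped,
-- leaving only the prefix term for 'car'.
select to_tsquery('english', 'the:* & car:*');

-- Query side with 'simple': no stopword removal, so both prefix terms stay.
select to_tsquery('simple', 'the:* & car:*');
```

The last query still requires a lexeme starting with 'the', and the tsvector contains none, so the match fails.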
best regards,
Florian Pflug