From: | "Yishai Lerner" <yish(at)alum(dot)mit(dot)edu> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #4306: TSearch2 stemming, stop words and lexize behaviour inconsistent |
Date: | 2008-07-14 21:04:41 |
Message-ID: | 200807142104.m6EL4fcq051121@wwwmaster.postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged online:
Bug reference: 4306
Logged by: Yishai Lerner
Email address: yish(at)alum(dot)mit(dot)edu
PostgreSQL version: 8.3.1
Operating system: RHEL5 and MacOSX 10.4
Description: TSearch2 stemming, stop words and lexize behaviour
inconsistent
Details:
I would expect to_tsquery to treat the three variations "what", "what's" and
"whats" consistently, and to ignore all of them, since each one reduces to the
stop word "what". However, this is not the case: to_tsquery('whats') returns
the stop word 'what' as its result. Even more confusing, the lexize results
below are inconsistent with the to_tsquery results below.
This seems like a bug to me.
goodrec_2=# select lexize('en_stem', 'what''s');
lexize
--------
{what}
goodrec_2=# select lexize('en_stem', 'whats');
lexize
--------
{what}
goodrec_2=# select lexize('en_stem', 'what');
lexize
--------
{}
goodrec_2=# select to_tsquery('what''s');
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s),
ignored
to_tsquery
goodrec_2=# select to_tsquery('whats');
to_tsquery
------------
'what'
goodrec_2=# select to_tsquery('what');
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s),
ignored
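
One possible reading of the six results above, offered only as a sketch and not confirmed anywhere in this report: the dictionary checks the stop list before it stems, and the query parser splits "what's" into the two tokens "what" and "s" before the dictionary sees them. The Python below is a toy model of that ordering, not PostgreSQL code. STOP_WORDS, toy_stem, toy_lexize and toy_to_tsquery are all hypothetical names, and the stemmer handles only the inputs shown in this report.

```python
import re

# Assumption: a tiny stand-in for the English stop list.
STOP_WORDS = {"what", "s"}


def toy_stem(word):
    """Hypothetical stemmer: strips a possessive 's or a trailing s.

    It only mimics the results in the report for these inputs.
    """
    if word.endswith("'s"):
        return word[:-2]
    if word.endswith("s") and len(word) > 1:
        return word[:-1]
    return word


def toy_lexize(token):
    """Assumed order: stop-list check on the raw token first, then stemming."""
    token = token.lower()
    if token in STOP_WORDS:
        return []
    return [toy_stem(token)]


def toy_to_tsquery(text):
    """Assumed parser step: split on non-alphanumerics, then lexize each token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    lexemes = []
    for token in tokens:
        lexemes.extend(toy_lexize(token))
    return lexemes


if __name__ == "__main__":
    for word in ("what's", "whats", "what"):
        print(word, toy_lexize(word), toy_to_tsquery(word))
```

Under these assumptions the model reproduces all six observations: "whats" is not itself on the stop list, so it is stemmed to "what" and kept, while "what's" reaches the dictionary directly in the lexize call but arrives as two stop words in the to_tsquery call. Whether PostgreSQL 8.3.1 actually works this way would need to be checked against the server, for example by inspecting how the parser tokenizes each input.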