tsvector stemmer issue

From: Jeff Trout <threshar(at)real(dot)jefftrout(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: tsvector stemmer issue
Date: 2013-11-22 13:25:47
Message-ID: A9849DFF-42BD-493C-A8CF-156CA28C7E42@torgo.978.org
Lists: pgsql-general

I ran into an interesting issue, and I'm not sure anything can be done about it: the Snowball stemmer reduces "severance" and "several" to the same stem, which is a big problem for me.

Even quoting the word doesn't help:
indie=> select to_tsvector('severance several');
to_tsvector
-------------
'sever':1,2
(1 row)

indie=> select to_tsvector('"severance" several');
to_tsvector
-------------
'sever':1,2
(1 row)

The Perl library Lingua::Stem::Snowball yields the same results, as expected, since both use Snowball.
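
To narrow down where the shared stem comes from, something like the following might help. It is only a sketch: it assumes the session's default text search configuration is the stock 'english' one (which the output above suggests) and that the built-in english_stem dictionary and 'simple' configuration have not been altered.

```sql
-- Show which dictionary handles each token and the lexemes it emits.
-- Assumption: 'english' is the configuration in use.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english', 'severance several');

-- Call the English Snowball dictionary directly on each word.
SELECT ts_lexize('english_stem', 'severance') AS severance_stem,
       ts_lexize('english_stem', 'several')   AS several_stem;

-- The built-in 'simple' configuration does no stemming, so the two
-- words should stay distinct, at the cost of losing stemming for
-- every other word as well.
SELECT to_tsvector('simple', 'severance several');
```

If the first two queries show english_stem producing the same lexeme for both words, the collapse happens in the dictionary rather than in the parser, which would also explain why quoting has no effect.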

Am I SOL here?


Jeff Trout <jeff(at)jefftrout(dot)com>
