From: | Andreas Joseph Krogh <andreak(at)officenet(dot)no> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Clarification of the "simple" dictionary |
Date: | 2010-07-22 17:32:42 |
Message-ID: | 4C4880BA.2090302@officenet.no |
Lists: | pgsql-general |
On 07/22/2010 06:27 PM, John Gage wrote:
> The easiest way to look at this is to give the simple dictionary a
> document with to_tsvector() and see if stopwords pop out.
>
> In my experience they do. In my experience, the simple dictionary
> just breaks the document down into the space etc. separated words in
> the document. It doesn't analyze further.
That's my experience too, but I want to make sure it doesn't actually
have any stopwords that I've missed. Trying many phrases and checking
for stopwords doesn't really prove anything.
Can anybody confirm that the "simple" dictionary only lowercases the
words and "uniques" them?
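
For reference, a minimal sketch of how this could be checked directly instead of by sampling phrases. It assumes a stock installation where the built-in "simple" dictionary has not been altered; the sample strings are made up for illustration.

```sql
-- Inspect the dictionary definition itself. If dictinitoption shows no
-- StopWords setting, no stopword file is configured for this dictionary.
SELECT dictname, dictinitoption
FROM pg_ts_dict
WHERE dictname = 'simple';

-- Ask the dictionary about a single token. An empty array would mean
-- "recognized as a stopword"; a lowercased lexeme means it is kept.
SELECT ts_lexize('simple', 'The');        -- {the}
SELECT ts_lexize('english_stem', 'The');  -- {} (stopword in the English dictionary)

-- Compare the two configurations on the same input. With "simple",
-- "the" survives, is lowercased, and its positions are merged.
SELECT to_tsvector('simple',  'The Quick the fox');  -- 'fox':4 'quick':2 'the':1,3
SELECT to_tsvector('english', 'The Quick the fox');  -- 'fox':4 'quick':2
```

The first query is the one that addresses the actual question, since it looks at the configuration rather than at the behaviour for particular words.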
--
Andreas Joseph Krogh<andreak(at)officenet(dot)no>
Senior Software Developer / CTO
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Rosenholmveien 25 | know how to do a thing and to watch |
1414 Trollåsen | somebody else doing it wrong, without |
NORWAY | comment. |
| |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+