Re: Clarification of the "simple" dictionary

From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Clarification of the "simple" dictionary
Date: 2010-07-22 17:32:42
Message-ID: 4C4880BA.2090302@officenet.no

On 07/22/2010 06:27 PM, John Gage wrote:
> The easiest way to look at this is to give the simple dictionary a
> document with to_tsvector() and see if stopwords pop out.
>
> In my experience they do. In my experience, the simple dictionary
> just breaks the document down into the space etc. separated words in
> the document. It doesn't analyze further.

That's my experience too; I just want to make sure it doesn't actually
have any stopwords that I've missed. Trying many phrases and checking
for stopwords doesn't really prove anything.

Can anybody confirm that the "simple" dictionary only lowercases the
words and "uniques" them?
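
Rather than guessing at phrases, one way to check might be to look at the
dictionary's definition directly. This is only a sketch, assuming a stock
installation where nobody has run ALTER TEXT SEARCH DICTIONARY on "simple";
the sample words and sentences are made up for illustration:

```sql
-- Show the options the dictionary was created with. A configured stop
-- word file would show up in dictinitoption as stopwords = '...'.
SELECT dictname, dictinitoption
FROM pg_ts_dict
WHERE dictname = 'simple';

-- Feed a typical English stop word straight to the dictionary.
-- An empty array ({}) would mean the word was discarded as a stop word.
SELECT ts_lexize('simple', 'The');

-- Compare the two configurations on the same (made-up) text.
SELECT to_tsvector('simple',  'The cat and the Cat');
SELECT to_tsvector('english', 'The cat and the Cat');
```

If dictinitoption comes back empty and ts_lexize() returns the lowercased
word, that would suggest no stop word list is in play. Merging duplicate
words would then be the work of to_tsvector() rather than of the
dictionary itself.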

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / CTO
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Rosenholmveien 25 | know how to do a thing and to watch |
1414 Trollåsen | somebody else doing it wrong, without |
NORWAY | comment. |
| |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+
