Re: Clarification of the "simple" dictionary

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Clarification of the "simple" dictionary
Date: 2010-07-22 19:28:13
Message-ID: Pine.LNX.4.64.1007222327270.32129@sn.sai.msu.ru
Lists: pgsql-general

Andreas,

I'd create my own copy of the dictionary, to be independent of system changes.
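
A minimal sketch of such a copy, assuming the built-in "simple" template and
no stop-word file. The names my_simple and my_config are hypothetical, chosen
here only for illustration:

```sql
-- Private dictionary based on the built-in "simple" template.
-- No STOPWORDS option is given, so no stop-word filtering takes place.
CREATE TEXT SEARCH DICTIONARY my_simple (
    TEMPLATE = pg_catalog.simple
);

-- Quick check: the token is only lower-cased, nothing is discarded.
SELECT ts_lexize('my_simple', 'The');

-- Private configuration that uses the copy instead of pg_catalog.simple.
CREATE TEXT SEARCH CONFIGURATION my_config (COPY = pg_catalog.simple);
ALTER TEXT SEARCH CONFIGURATION my_config
    ALTER MAPPING REPLACE pg_catalog.simple WITH my_simple;
```

Queries would then name the configuration explicitly, for example
to_tsvector('my_config', ...), so a later change to the system dictionary
should not affect them.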

Oleg
On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote:

> On 07/22/2010 07:44 PM, Oleg Bartunov wrote:
>> Don't guess, read the docs:
>> http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY
>>
>> 12.6.2. Simple Dictionary
>>
>> The simple dictionary template operates by converting the input token to
>> lower case and checking it against a file of stop words. If it is found in
>> the file then an empty array is returned, causing the token to be
>> discarded. If not, the lower-cased form of the word is returned as the
>> normalized lexeme. Alternatively, the dictionary can be configured to
>> report non-stop-words as unrecognized, allowing them to be passed on to the
>> next dictionary in the list.
>>
>> d=# \dFd+ simple
>>                                   List of text search dictionaries
>>    Schema   |  Name  |     Template      | Init options |                        Description
>> ------------+--------+-------------------+--------------+-----------------------------------------------------------
>>  pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword
>>
>> By default it has no Init options, so it doesn't check for stopwords.
>
> Guess what - I *have* read the docs, which say "...and checking it against a
> file of stop words". What was unclear to me was whether or not it is
> configured with a stop-words file by default, which I understand from your
> reply is not the case. Very good, fits my needs like a glove :-) It might
> be worth considering updating the docs to make this clearer?
>
> So - can we rely on "simple" to remain this way forever (no Init options) or
> is it better to make a copy of it with the same properties as today?
>
> It seems "simple" + the unaccent dictionary available in 9.0 save my day,
> thanks Mr. Bartunov.
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
