From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Jérôme Etévé <jerome(dot)eteve(at)gmail(dot)com> |
Cc: | Michael Nacos <m(dot)nacos(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: What is the simpliest text search configuration? |
Date: | 2009-11-12 16:24:03 |
Message-ID: | Pine.LNX.4.64.0911121923190.6801@sn.sai.msu.ru |
Lists: | pgsql-general |
We submitted an unaccent dictionary for 8.5.
See http://www.sai.msu.su/~megera/wiki/unaccent for some information.
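A minimal sketch of how such a dictionary could be placed in front of simple, so that tokens are unaccented first and then lower-cased with no stop-word filtering. It assumes the module is installed and exposes a dictionary named unaccent; CREATE EXTENSION is the installation syntax of later PostgreSQL releases (9.1 and up), and the configuration name sb_unaccent is made up for illustration.

```sql
-- Assumption: the unaccent module is available on this server.
-- CREATE EXTENSION exists only in PostgreSQL 9.1 and later.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical configuration name; start from the built-in simple configuration.
CREATE TEXT SEARCH CONFIGURATION sb_unaccent ( COPY = pg_catalog.simple );

-- unaccent is a filtering dictionary: it rewrites the token and passes it on
-- to the next dictionary in the list, here simple, which lower-cases it.
ALTER TEXT SEARCH CONFIGURATION sb_unaccent
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH unaccent, simple;

-- Quick check: the lexemes should come back unaccented and lower-cased,
-- with short words such as "de" and "la" still present.
SELECT to_tsvector('sb_unaccent', 'Hôtel de la Mer');
```

The order in the WITH clause matters: if simple came first it would accept every token and unaccent would never be consulted.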
Oleg
On Thu, 12 Nov 2009, Jérôme Etévé wrote:
> Hi Michael,
>
> I actually found that the 'simple' dictionary doesn't enforce a
> stopword list by default, so I defined my search configuration like
> this and it works:
>
> create text search configuration sbsimple ( parser = 'default' ) ;
> alter text search configuration sbsimple ALTER MAPPING FOR
> word,hword,asciiword,asciihword WITH simple;
>
> Cheers!
>
> J.
>
> 2009/11/12 Michael Nacos <m(dot)nacos(at)gmail(dot)com>:
>> Dear Jerome,
>>
>> from personal experience full-text searching in PostgreSQL can be quite
>> powerful but it's not simple, it requires thought, planning and coding.
>> PostgreSQL mainly provides an efficient token matching mechanism
>> supporting positional information and weights, but natural language
>> processing and normalization is pretty basic.
>>
>> If you don't mind writing a couple of user-defined functions to take
>> control of lexeme normalization, then tsvector/tsquery support can be a
>> very powerful tool for custom search engines.
>>
>> regards,
>>
>> Michael
>>
>> 2009/11/12 Jérôme Etévé <jerome(dot)eteve(at)gmail(dot)com>
>>>
>>> Hi all,
>>>
>>> I'd like to implement a full text search with PostgreSQL, and I can't
>>> find a text search configuration that would just:
>>>
>>> - map Unicode accented letters to their unaccented equivalents
>>> - tokenize the words (and skip any non-word characters)
>>> - apply no stopwords
>>> - lower-case the tokens
>>>
>>> How can I achieve this? I'm particularly interested in deactivating
>>> the stopwords filtering.
>>>
>>> I tried pg_catalog.simple, but despite its name, it still considers
>>> stop words.
>>>
>>> Thanks for your help!
>>>
>>> Jerome.
>>>
>>
>>
>
>
>
>
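The observation quoted above, that the simple dictionary applies no stop-word list by default, can be checked directly. A small sketch using only the built-in simple dictionary and configuration:

```sql
-- The built-in simple dictionary lower-cases the token and, having no
-- stop-word file configured, returns it rather than discarding it.
SELECT ts_lexize('simple', 'The');
-- {the}

-- The same behaviour through the built-in simple configuration:
-- "the" is kept as a lexeme alongside "cat".
SELECT to_tsvector('simple', 'The Cat');
-- 'cat':2 'the':1
```

If a stop word disappears from the result, the configuration in use is probably a language-specific one (for example english) rather than simple.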
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83