From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Tommy Gildseth <tommy(dot)gildseth(at)usit(dot)uio(dot)no> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Text search with ispell |
Date: | 2009-01-27 14:49:06 |
Message-ID: | Pine.LNX.4.64.0901271746420.9554@sn.sai.msu.ru |
Lists: | pgsql-general |
On Tue, 27 Jan 2009, Tommy Gildseth wrote:
> Tommy Gildseth wrote:
>> Oleg Bartunov wrote:
>>> Have you read
>>> http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY
>>> We suggest using the dictionaries that come with OpenOffice; hunspell
>>> probably has better support for compound words.
>>>
>>
>> Thanks, that knocked me onto the right track. Too easy to miss the
>> blindingly obvious at times. :-)
>> Works beautifully now.
>>
>
> I may have been too quick to declare success.
>
> The following works as expected, returning the individual words:
> SELECT
> ts_debug('norwegian', 'overbuljongterningpakkmesterassistent'),
> ts_debug('norwegian', 'sjokoladefabrikk'),
> ts_debug('norwegian', 'epleskrott');
> -[ RECORD 1 ]----------------------------------------------------------
> ts_debug | (asciiword,"Word, all ASCII",overbuljongterningpakkmesterassistent,"{no_ispell,norwegian_stem}",no_ispell,"{buljong,terning,pakk,mester,assistent}")
> ts_debug | (asciiword,"Word, all ASCII",sjokoladefabrikk,"{no_ispell,norwegian_stem}",no_ispell,"{sjokoladefabrikk,sjokolade,fabrikk}")
> ts_debug | (asciiword,"Word, all ASCII",epleskrott,"{no_ispell,norwegian_stem}",no_ispell,"{epleskrott,eple,skrott}")
>
>
> But, the following does not:
> SELECT
> ts_debug('norwegian', 'hemsedalsdans'),
> ts_debug('norwegian', 'l?rdalsbrua'),
> ts_debug('norwegian', 'hengesmykke');
> -[ RECORD 1 ]----------------------------------------------------------
> ts_debug | (asciiword,"Word, all ASCII",hemsedalsdans,"{no_ispell,norwegian_stem}",norwegian_stem,{hemsedalsdan})
> ts_debug | (word,"Word, all letters",l?rdalsbrua,"{no_ispell,norwegian_stem}",norwegian_stem,{l?rdalsbru})
> ts_debug | (asciiword,"Word, all ASCII",hengesmykke,"{no_ispell,norwegian_stem}",norwegian_stem,{hengesmykk})
>
>
> Would this be due to a limitation in the dictionary, or a misconfiguration on
> my side?
Sorry, I don't know Norwegian. What do you mean? Are you saying that
no_ispell doesn't recognize these words?
>
> Commands used are as follows:
>
> CREATE TEXT SEARCH DICTIONARY no_ispell (
> TEMPLATE = ispell,
> DictFile = nb_NO,
> AffFile = nb_NO,
> StopWords = norwegian
> );
>
> and
>
> ALTER TEXT SEARCH CONFIGURATION norwegian ALTER MAPPING FOR asciiword,
> asciihword, hword_asciipart, word, hword, hword_part WITH no_ispell,
> norwegian_stem;
>
>
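One way to narrow this down, as a rough sketch that assumes the no_ispell
dictionary is defined exactly as in the commands quoted above: call
ts_lexize on the dictionary directly, bypassing the norwegian configuration.
ts_lexize returns NULL when the dictionary does not recognize a token, and in
that case the configuration falls through to the next dictionary in the
mapping (norwegian_stem here), which is consistent with the second ts_debug
output. The column aliases are only illustrative names.

```sql
-- Ask the ispell dictionary directly, without going through the
-- 'norwegian' text search configuration. A NULL result means no_ispell
-- does not know the word, so the next dictionary in the mapping
-- (norwegian_stem) would handle it instead.
SELECT ts_lexize('no_ispell', 'sjokoladefabrikk') AS known_compound,
       ts_lexize('no_ispell', 'hemsedalsdans')    AS problem_word_1,
       ts_lexize('no_ispell', 'hengesmykke')      AS problem_word_2;
```

If the last two columns come back NULL, the cause is more likely in the
dictionary and affix files than in the mapping, for instance a word part that
is missing from the dictionary file or not flagged as usable in compounds.
That is only a guess until the output has been checked.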
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83