From: | Tommy Gildseth <tommy(dot)gildseth(at)usit(dot)uio(dot)no> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Text search with ispell |
Date: | 2009-01-27 14:07:12 |
Message-ID: | 497F1510.9000400@usit.uio.no |
Lists: | pgsql-general |
Tommy Gildseth wrote:
> Oleg Bartunov wrote:
>> Have you read
>> http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY
>>
>> We suggest using the dictionaries that come with OpenOffice; hunspell
>> probably has better support for composite words.
>>
>
> Thanks, that knocked me onto the right track. Too easy to miss the
> blindingly obvious at times. :-)
> Works beautifully now.
>
I may have been too quick to declare success.
The following works as expected, returning the individual words:
SELECT
    ts_debug('norwegian', 'overbuljongterningpakkmesterassistent'),
    ts_debug('norwegian', 'sjokoladefabrikk'),
    ts_debug('norwegian', 'epleskrott');
-[ RECORD 1 ]------------------------------------------------------------
ts_debug | (asciiword,"Word, all ASCII",overbuljongterningpakkmesterassistent,"{no_ispell,norwegian_stem}",no_ispell,"{buljong,terning,pakk,mester,assistent}")
ts_debug | (asciiword,"Word, all ASCII",sjokoladefabrikk,"{no_ispell,norwegian_stem}",no_ispell,"{sjokoladefabrikk,sjokolade,fabrikk}")
ts_debug | (asciiword,"Word, all ASCII",epleskrott,"{no_ispell,norwegian_stem}",no_ispell,"{epleskrott,eple,skrott}")
But, the following does not:
SELECT
    ts_debug('norwegian', 'hemsedalsdans'),
    ts_debug('norwegian', 'lærdalsbrua'),
    ts_debug('norwegian', 'hengesmykke');
-[ RECORD 1 ]------------------------------------------------------------
ts_debug | (asciiword,"Word, all ASCII",hemsedalsdans,"{no_ispell,norwegian_stem}",norwegian_stem,{hemsedalsdan})
ts_debug | (word,"Word, all letters",lærdalsbrua,"{no_ispell,norwegian_stem}",norwegian_stem,{lærdalsbru})
ts_debug | (asciiword,"Word, all ASCII",hengesmykke,"{no_ispell,norwegian_stem}",norwegian_stem,{hengesmykk})
Would this be due to a limitation in the dictionary, or a
misconfiguration on my side?
Commands used are as follows:
CREATE TEXT SEARCH DICTIONARY no_ispell (
    TEMPLATE = ispell,
    DictFile = nb_NO,
    AffFile = nb_NO,
    StopWords = norwegian
);
and
ALTER TEXT SEARCH CONFIGURATION norwegian
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH no_ispell, norwegian_stem;
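
One way to narrow this down, as a rough sketch: ts_lexize calls a single dictionary directly and bypasses the configuration's token mapping, so it shows what no_ispell alone does with a word. It returns NULL when the dictionary does not recognize the input. The split of 'hemsedalsdans' into 'hemsedal' and 'dans' below is my own assumption about how the compound is built, and the column aliases are only illustrative.

```sql
-- Query the ispell dictionary on its own, independent of the
-- 'norwegian' configuration. A NULL result means no_ispell does
-- not recognize the word.
SELECT
    ts_lexize('no_ispell', 'sjokoladefabrikk') AS working_compound,
    ts_lexize('no_ispell', 'hemsedalsdans')    AS failing_compound,
    -- Assumed component words of the failing compound:
    ts_lexize('no_ispell', 'hemsedal')         AS assumed_first_part,
    ts_lexize('no_ispell', 'dans')             AS assumed_second_part;
```

If the failing compound comes back NULL here as well, the mapping is probably not at fault, since ts_debug then simply falls through to norwegian_stem. If the assumed parts are recognized individually but the compound is not, that would point towards the dictionary or affix file rather than the configuration.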
--
Tommy Gildseth