Re: Solved: questions about tsearch2 (for czech language)

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pavel Stehule <stehule(at)kix(dot)fsv(dot)cvut(dot)cz>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, pgsql-general(at)postgresql(dot)org, openfts discussion <openfts-general(at)lists(dot)sourceforge(dot)net>
Subject: Re: Solved: questions about tsearch2 (for czech language)
Date: 2003-12-23 09:08:25
Message-ID: 3FE80609.3090408@sigaev.ru
Lists: pgsql-general

> You are right. After restarting the postmaster, everything works fine.
One comment: you don't need to restart the postmaster. It is enough to reconnect
to PostgreSQL by exiting and restarting psql, because every new connection
creates a new child process of the postmaster.
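
A minimal sketch of reconnecting without leaving psql, assuming the database is named tsearch2 as the prompt in the quoted session suggests. The \c meta-command opens a new connection and closes the current one, so the statements after it run in a fresh backend:

```sql
-- Assumption: the database is called tsearch2 (taken from the psql prompt in this thread).
-- Run this after changing pg_ts_dict or pg_ts_cfgmap.
\c tsearch2

-- This query now runs in a new child process of the postmaster.
select to_tsvector('default_czech', 'Jmenuji se Pavel Stěhule');
```

Exiting with \q and starting psql again has the same effect.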

>
> tsearch2=# select to_tsvector('default_czech','Jmenuji se Pavel Stěhule');
> to_tsvector
> ------------------------------------
> 'pavel':3 'stěhule':4 'jmenovat':1
>
> Thank You very much
>
> Pavel Stehule
>
>
> On Mon, 22 Dec 2003, Oleg Bartunov wrote:
>
>
>>Pavel,
>>
>>Did you restart your psql session after modifying the tsearch2 configuration?
>>By the way, there is a Czech dictionary available from http://lingucomponent.openoffice.org/download_dictionary.html
>>We have a utility to convert myspell dictionaries to ispell ones. It's included
>>in 7.5 development. A patch for 7.4 can be downloaded from
>>http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
>>
>>Also, for historical reasons, we use the openfts mailing list to discuss
>>tsearch2.
>>
>> Oleg
>>On Mon, 22 Dec 2003, Pavel Stehule wrote:
>>
>>
>>>>>result. Why? Is there a problem with my configuration?
>>>>
>>>>Did you specify stop words in the dictionary configuration?
>>>>
>>>>select * from pg_ts_dict;
>>>>
>>>
>>>tsearch2=# select * from pg_ts_dict where dict_name ='cz_ispell';
>>>-[ RECORD 1 ]---+--------------------------------------------------------------------------------------------------------------------------
>>>dict_name       | cz_ispell
>>>dict_init       | 173405
>>>dict_initoption | DictFile="/usr/lib/ispell/czech",AffFile="/usr/lib/ispell/czech.aff",StopFile="/usr/local/pgsql/share/contrib/czech.stop"
>>>dict_lexize     | 173406
>>>dict_comment    |
>>>
>>>[postgres@usop root]$ cat /usr/local/pgsql/share/contrib/czech.stop|grep -e "^[sv]."
>>>se
>>>sem
>>>si
>>>svůj
>>>ve
>>>vám
>>>váš
>>>viz
>>>vy
>>>
>>>
>>>>>2. I use a small Czech dictionary. I need words that aren't in the
>>>>>dictionary (Stěhule in my example) not to be erased. Can I set this
>>>>>somewhere? I tried adding the simple dict to the cfg map, but without success.
>>>>>
>>>>
>>>>An example, please! What do you mean by 'erase words'?
>>>>
>>>>
>>>>
>>>>>tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
>>>>>    ts_name    | tok_type | description |  token  |     dict_name      | tsvector
>>>>>---------------+----------+-------------+---------+--------------------+-----------
>>>>> default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
>>>>> default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
>>>>> default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'
>>>>>
>>>>>
>>>
>>>If tsearch doesn't find a word in the dictionary, it erases the word from
>>>the result. Is that right? My surname, for example, isn't in the dictionary,
>>>but I want to keep this word in the result (tsvector).
>>>
>>>I use
>>>
>>>tsearch2=# select version();
>>> version
>>>-------------------------------------------------------------------------------------------------------
>>> PostgreSQL 7.4RC2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3 20030715 (Red Hat Linux 3.3-14)
>>>
>>>
>>
>> Regards,
>> Oleg
>>_____________________________________________________________
>>Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
>>Sternberg Astronomical Institute, Moscow University (Russia)
>>Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
>>phone: +007(095)939-16-83, +007(095)939-23-83
>>
>>
>
>
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
