From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Pavel Stehule <stehule(at)kix(dot)fsv(dot)cvut(dot)cz> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: questions about tsearch2 (for czech language) |
Date: | 2003-12-22 11:05:59 |
Message-ID: | Pine.GSO.4.58.0312221401080.14104@ra.sai.msu.su |
Lists: | pgsql-general |
On Mon, 22 Dec 2003, Pavel Stehule wrote:
> Hello
>
> I am trying tsearch2 in a Czech environment. It works fine, but I have two
> questions.
>
> 1. I have the words "se" and "ve" in my Czech stop word list, but I still get
> these words in the result. Why? Is there a problem with my configuration?
Did you specify the stop words in the dictionary configuration?
select * from pg_ts_dict;
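
For an ispell-based dictionary in tsearch2, the stop word file is normally named in the dict_initoption column through a StopFile entry, next to DictFile and AffFile. A minimal sketch of checking and setting it follows. The dictionary name cz_ispell is taken from the output quoted below; all file paths are hypothetical placeholders, not the poster's real locations.

```sql
-- Show how the Czech ispell dictionary is currently initialised.
SELECT dict_name, dict_initoption
FROM pg_ts_dict
WHERE dict_name = 'cz_ispell';

-- Assumption: these paths are placeholders; substitute your own files.
-- If StopFile is missing from dict_initoption, no stop words are applied.
UPDATE pg_ts_dict
SET dict_initoption = 'DictFile="/usr/local/share/ispell/czech.dict",'
                      'AffFile="/usr/local/share/ispell/czech.aff",'
                      'StopFile="/usr/local/share/ispell/czech.stop"'
WHERE dict_name = 'cz_ispell';
```

Dictionaries are likely cached per session, so a new connection may be needed before ts_debug reflects the change.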
>
> tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
>     ts_name    | tok_type | description |  token  |  dict_name  | tsvector
> ---------------+----------+-------------+---------+-------------+-----------
>  default_czech | lword    | Latin word  | jmenuji | {cz_ispell} | 'jmenuji'
>  default_czech | lword    | Latin word  | se      | {cz_ispell} | 'se'
>  default_czech | lword    | Latin word  | Pavel   | {cz_ispell} | 'pavel'
>  default_czech | word     | Word        | Stěhule | {cz_ispell} |
>  default_czech | lword    | Latin word  | a       | {cz_ispell} |
>  default_czech | word     | Word        | bydlím  | {cz_ispell} | 'bydlet'
>  default_czech | lword    | Latin word  | ve      | {cz_ispell} | 've'
>  default_czech | lword    | Latin word  | Skalici | {cz_ispell} | 'skalici'
> (8 řádek)
>
> tsearch2=# select * from pg_ts_cfgmap where ts_name='default_czech';
> ts_name | tok_alias | dict_name
> ---------------+--------------+-------------
> default_czech | email | {simple}
> default_czech | file | {simple}
> default_czech | float | {simple}
> default_czech | host | {simple}
> default_czech | hword | {cz_ispell}
> default_czech | int | {simple}
> default_czech | lhword | {cz_ispell}
> default_czech | lpart_hword | {cz_ispell}
> default_czech | lword | {cz_ispell}
> default_czech | nlhword | {cz_ispell}
> default_czech | nlpart_hword | {cz_ispell}
> default_czech | nlword | {cz_ispell}
> default_czech | part_hword | {simple}
> default_czech | sfloat | {simple}
> default_czech | uint | {simple}
> default_czech | uri | {simple}
> default_czech | url | {simple}
> default_czech | version | {simple}
> default_czech | word | {cz_ispell}
> (19 řádek)
>
> 2. I use a small Czech dictionary. I need to keep words that are not in the
> dictionary (Stěhule in my sample) instead of having them dropped. Can I set
> this somewhere? I tried adding the simple dictionary to the cfg map, but
> without success.
>
An example, please! What do you mean by 'erase words'?
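
If the question is about words that the ispell dictionary does not recognize, the usual approach is to list a fallback dictionary after it in pg_ts_cfgmap, so that a token unknown to the first dictionary is passed to the next one. The sketch below assumes the configuration and dictionary names shown in the quoted output (default_czech, cz_ispell, simple). The choice of token types to remap is only an illustration.

```sql
-- Fall back to the simple dictionary when cz_ispell does not know a word.
-- Assumption: only these token types are remapped; extend the list as needed.
UPDATE pg_ts_cfgmap
SET dict_name = '{cz_ispell,simple}'
WHERE ts_name = 'default_czech'
  AND tok_alias IN ('lword', 'nlword', 'word');

-- Check each dictionary in isolation with lexize(dictionary, word).
SELECT lexize('cz_ispell', 'se');       -- a configured stop word
SELECT lexize('cz_ispell', 'Stěhule');  -- a word missing from the dictionary
SELECT lexize('simple', 'Stěhule');     -- what the fallback would produce
```

Comparing the three results should show whether the stop word file is being read and which dictionary, if any, recognizes the missing word.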
> tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
>     ts_name    | tok_type | description |  token  |     dict_name      | tsvector
> ---------------+----------+-------------+---------+--------------------+-----------
>  default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
>  default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
>  default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'
>
>
> Thank You
> Pavel Stehule
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83