From: | Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vtamara(at)pasosdeJesus(dot)org |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #13690: Full Text Search with spanish dictionary cannot find some words |
Date: | 2015-10-22 10:16:54 |
Message-ID: | 5628B796.1080305@postgrespro.ru |
Lists: | pgsql-bugs |
On 20.10.2015 19:21, Tom Lane wrote:
> This is because you didn't adjust the wildcard search pattern for the
> different stemming rules used in Spanish. Look at the to_tsvector and
> to_tsquery results:
>
> regression=# SELECT to_tsvector('english', nombre) , to_tsquery('english','politi:*') from cat;
>        to_tsvector       | to_tsquery
> -------------------------+------------
>  'politica':1 'social':2 | 'politi':*
> (1 row)
>
> regression=# SELECT to_tsvector('spanish', nombre) , to_tsquery('spanish','politi:*') from cat;
>      to_tsvector      | to_tsquery
> ----------------------+------------
>  'polit':1 'social':2 | 'politi':*
> (1 row)
>
> I don't know enough Spanish to follow the reasoning for stemming
> "politica" as "polit" rather than something else; but I do see that
> "politi" is not reduced to "polit", which is fairly reasonable since
> that's not a word. "politi:*" will match anything whose stemmed
> version starts with "politi", but that's too long ...
Tom is right. You cannot change the stemming rules used for Spanish, but
you can create a synonym dictionary that maps "politi" to "polit".
First, you need to create the file $SHAREDIR/tsearch_data/spanish.syn with
the following entry:
politi polit
Then execute the following script:
CREATE TEXT SEARCH DICTIONARY spanish_synonym (
    TEMPLATE = synonym,
    SYNONYMS = spanish
);
ALTER TEXT SEARCH CONFIGURATION spanish
    ALTER MAPPING FOR asciiword
    WITH spanish_synonym, spanish_stem;
After that the following query:
SELECT COUNT(*) FROM cat
    WHERE to_tsvector('spanish', nombre) @@ to_tsquery('spanish', 'politi:*');
will return 1.
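If the count is not what you expect, queries along the following lines may help to check that the synonym dictionary is actually consulted. This is only a sketch: it assumes the spanish_synonym dictionary and the altered spanish configuration from the script above are already in place, and the input string 'politi politica' is just an illustrative choice. I have left the output out on purpose.

```sql
-- Ask the synonym dictionary directly. A word listed in spanish.syn
-- comes back as a lexeme array; an unknown word yields NULL and is
-- then passed on to the next dictionary in the mapping (spanish_stem).
SELECT ts_lexize('spanish_synonym', 'politi');

-- Show, token by token, which dictionaries are configured for the
-- token type and which one produced the lexemes.
SELECT alias, token, dictionaries, dictionary, lexemes
  FROM ts_debug('spanish', 'politi politica');

-- Inspect the prefix query that will be matched against the tsvector.
SELECT to_tsquery('spanish', 'politi:*');
```

The last query is the one to compare against the to_tsvector output quoted above: the prefix in the tsquery has to be no longer than the stemmed lexeme stored in the tsvector.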
You can read about synonym dictionaries in the documentation:
http://www.postgresql.org/docs/devel/static/textsearch-dictionaries.html
--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company