From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | a tsearch issue |
Date: | 2011-11-04 10:22:45 |
Message-ID: | CAFj8pRD1rTJaFEnEGjEfB9BCQMPD=w46z1vvO6xK=4Z8Dvhx+Q@mail.gmail.com |
Lists: | pgsql-hackers |
Hello
I found an interesting issue while testing tsearch prefix searching.
We use an ispell-based dictionary:
CREATE TEXT SEARCH DICTIONARY cspell
(template=ispell, dictfile = czech, afffile=czech, stopwords=czech);
CREATE TEXT SEARCH CONFIGURATION cs (copy=english);
ALTER TEXT SEARCH CONFIGURATION cs
ALTER MAPPING FOR word, asciiword WITH cspell, simple;
Then I created a table
postgres=# create table n(a varchar);
CREATE TABLE
postgres=# insert into n values('Stěhule'),('Chromečka');
INSERT 0 2
postgres=# select * from n;
a
───────────
Stěhule
Chromečka
(2 rows)
and I tested prefix searching.
I found the following issue:
postgres=# select * from n where to_tsvector('cs', a) @@
to_tsquery('cs','Stě:*') ;
a
───
(0 rows)
I expected one row. The problem is in the transformation of the word 'Stě':
postgres=# select * from ts_debug('cs','Stě:*') ;
─[ RECORD 1 ]┬──────────────────
alias │ word
description │ Word, all letters
token │ Stě
dictionaries │ {cspell,simple}
dictionary │ cspell
lexemes │ {sto}
─[ RECORD 2 ]┼──────────────────
alias │ blank
description │ Space symbols
token │ :*
dictionaries │ {}
dictionary │ [null]
lexemes │ [null]
An ispell dictionary cannot work well with only the first n characters
of a word: here cspell normalizes the prefix 'Stě' to the lexeme 'sto'.
I don't know what the correct solution to this problem is.
At a minimum, the prefix search documentation should note that it
cannot work well with *spell dictionaries, or describe this issue.
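One possible workaround, as an untested sketch only: run prefix queries
against the built-in simple configuration on both sides, so that the
prefix is not passed through the ispell dictionary. This gives up
stemming and stop word removal for these queries, and it assumes that any
index on the column is built with the same configuration.

```sql
-- Sketch: use the built-in 'simple' configuration for both the document
-- and the query, so the prefix 'Stě' is only lowercased and is not
-- normalized by cspell.
select * from n
 where to_tsvector('simple', a) @@ to_tsquery('simple', 'Stě:*');
```

I would expect this to match 'Stěhule', provided the database locale
lowercases the column value and the prefix consistently.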
Regards
Pavel Stehule