From: | Stanislav Raskin <raskin(at)livn(dot)de> |
---|---|
To: | <pgsql-general(at)postgresql(dot)org> |
Subject: | full text search to_tsquery performance with ispell dictionary |
Date: | 2011-05-11 11:19:38 |
Message-ID: | C9F03D6A.202C0%raskin@livn.de |
Lists: | pgsql-general |
Hello everybody,
I have been experimenting with the FTS feature on PostgreSQL 8.3.4 lately and
encountered a weird performance issue when using a custom FTS configuration.
I use this German ispell dictionary, re-encoded to UTF-8:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
With the following configuration:
CREATE TEXT SEARCH CONFIGURATION public.german_de (
    COPY = pg_catalog.german
);
CREATE TEXT SEARCH DICTIONARY german_de_ispell (
    TEMPLATE = ispell,
    DictFile = german_de_utf8,
    AffFile = german_de_utf8,
    StopWords = german_de_utf8
);
ALTER TEXT SEARCH CONFIGURATION german_de
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH german_de_ispell, german_stem;
So far so good. Indexing and creation of tsvectors works like a charm.
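For anyone who wants to reproduce the setup, one way to confirm that the mapping behaves as intended is sketched below. It only uses the built-in ts_debug and ts_lexize functions; the sample words are arbitrary placeholders, not taken from my data.

```sql
-- Show which dictionary handles each token under the custom configuration.
-- 'Die Häuser am Bahnhof' is just an illustrative sample text.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('german_de', 'Die Häuser am Bahnhof');

-- Ask the ispell dictionary directly, bypassing the parser and the
-- german_stem fallback. A NULL result means the word is unknown to it.
SELECT ts_lexize('german_de_ispell', 'Häuser');
```

If the dictionary column shows german_de_ispell for ordinary words and german_stem only for words the ispell dictionary does not know, the mapping is working as configured.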
The problem is that if I open a new connection to the database and run
something like this
SELECT to_tsquery('german_de', 'abcd');
the query takes a long time to complete the first time, about 1 to 1.5 s. If I
submit the same query a second, third, or fourth time on the same connection,
it takes only some 10-20 ms, which is what I would expect.
It almost seems as if the dictionary is somehow analyzed or indexed and the
results cached for each connection, which seems counter-intuitive to me.
After all, the dictionaries should not change that often.
Did I miss something or did I do something wrong?
I'd be thankful for any advice.
Kind Regards
--
Stanislav Raskin
livn GmbH
Campus Freudenberg
Rainer-Gruenter-Str. 21
42119 Wuppertal
+49(0)202-8 50 66 921
raskin(at)livn(dot)de
http://www.livn.de
livn
local individual video news GmbH
Registergericht Wuppertal HRB 20086
Managing Directors:
Dr. Stefan Brües
Alexander Jacob