Re: full text search to_tsquery performance with ispell dictionary

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Stanislav Raskin <raskin(at)livn(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: full text search to_tsquery performance with ispell dictionary
Date: 2011-05-11 13:45:55
Message-ID: BANLkTik9pr1B1zy+0kFqmSbFv3rVMYV=Kw@mail.gmail.com
Lists: pgsql-general

Hello

2011/5/11 Stanislav Raskin <raskin(at)livn(dot)de>:
> Hello everybody,
> I was experimenting with the FTS feature on postgres 8.3.4 lately and
> encountered a weird performance issue when using a custom FTS configuration.
> I use this german ispell dictionary, re-encoded to utf8:
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
> With the following configuration:
>
> CREATE TEXT SEARCH CONFIGURATION public.german_de (COPY = pg_catalog.german);
>
> CREATE TEXT SEARCH DICTIONARY german_de_ispell (
>     TEMPLATE = ispell,
>     DictFile = german_de_utf8,
>     AffFile = german_de_utf8,
>     StopWords = german_de_utf8
> );
>
> ALTER TEXT SEARCH CONFIGURATION german_de
>     ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
>                       word, hword, hword_part
>     WITH german_de_ispell, german_stem;
>
> So far so good. Indexing and creation of tsvectors works like a charm.
> The problem is, that if I open a new connection to the database and do
> something like this
> SELECT to_tsquery('german_de', 'abcd');
> it takes A LOT of time for the query to complete for the first time. About
> 1-1.5 s. If I submit the same query for a second, third, fourth time and so
> on, it takes only some 10-20 ms, which is what I would expect.
> It almost seems as if the dictionary is somehow analyzed or indexed and the
> results cached for each connection, which seems counter-intuitive to me.
> After all, the dictionaries should not change that often.
> Did I miss something or did I do something wrong?
> I'd be thankful for any advice.
> Kind Regards

This is expected behavior :( . Loading an ispell dictionary is very slow,
and each new session loads it again on first use.

Use the German snowball dictionary instead.
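
A minimal sketch of that change, assuming the german_de configuration
quoted above and the built-in german_stem snowball dictionary it already
references. Re-mapping the token types to german_stem alone takes the
ispell dictionary out of the picture, so it no longer has to be loaded:

```sql
-- Sketch only: replace the ispell + snowball chain with snowball alone.
-- german_de and the token types are taken from the quoted configuration.
ALTER TEXT SEARCH CONFIGURATION german_de
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH german_stem;

-- Re-run the test from the original message in a fresh session.
SELECT to_tsquery('german_de', 'abcd');
```

Snowball and ispell can produce different lexemes for the same word, so
tsvectors built with the old mapping would probably need to be
regenerated after such a change.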

You can also use connection pooling software.

Regards

Pavel Stehule

> --
>
> Stanislav Raskin
>
> livn GmbH
> Campus Freudenberg
> Rainer-Gruenter-Str. 21
> 42119 Wuppertal
>
> +49(0)202-8 50 66 921
> raskin(at)livn(dot)de
> http://www.livn.de
>
> livn
> local individual video news GmbH
> Registergericht Wuppertal HRB 20086
>
> Geschäftsführer:
> Dr. Stefan Brües
> Alexander Jacob
