Re: Very bad FTS performance with the Polish config

From: Wojciech Knapik <webmaster(at)wolniartysci(dot)pl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Very bad FTS performance with the Polish config
Date: 2009-11-18 19:13:10
Message-ID: 4B044746.3090604@wolniartysci.pl
Lists: pgsql-hackers


Oleg Bartunov wrote:

>> Yes, for 4-word texts the results are similar.
>> Try that with a longer text and the difference becomes more and more
>> significant. For the lorem ipsum text, 'polish' is about 4 times
>> slower than 'english'. For 5 repetitions of the text, it's 6 times,
>> for 10 repetitions - 7.5 times...
>
> Again, I see nothing unclear here, since dictionaries (as specified
> in configuration) apply to ALL words in document. The more words in
> document, the more overhead.

You're missing the point. I'm not surprised that the function takes more
time for larger input texts - that's obvious. The thing is, the
computation times rise more steeply when the Polish config is used.
Steeply enough that the difference between the Polish and English
configs becomes enormous in practical cases.
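
To spell the argument out with the figures quoted above: if each config simply added a fixed cost per word, the polish/english ratio would stay the same no matter how long the text is. The sketch below compares that prediction with the ratios reported earlier in this thread (4x, 6x, 7.5x). The per-word costs and the 100-words-per-copy figure are made-up placeholders, chosen only so the single-copy case matches the 4x figure; they are not measurements.

```python
# Slowdown factors (polish time / english time) quoted earlier in the thread,
# keyed by the number of repetitions of the lorem ipsum text.
observed_ratio = {1: 4.0, 5: 6.0, 10: 7.5}


def constant_overhead_ratio(per_word_polish, per_word_english, words):
    """Ratio predicted if both configs cost a fixed amount per word."""
    return (per_word_polish * words) / (per_word_english * words)


# Hypothetical values: 4.0 vs 1.0 cost units per word, 100 words per copy.
# They are placeholders, not timings taken from a real server.
predicted = {
    n: constant_overhead_ratio(4.0, 1.0, 100 * n) for n in observed_ratio
}

for n in sorted(observed_ratio):
    print(
        f"{n:>2} repetitions: predicted {predicted[n]:.1f}x, "
        f"observed {observed_ratio[n]:.1f}x"
    )

# Under a pure per-word overhead the ratio never changes; the reported
# ratio rises with every increase in input length.
ratios = [observed_ratio[n] for n in sorted(observed_ratio)]
ratio_grows = all(a < b for a, b in zip(ratios, ratios[1:]))
```

A constant per-word overhead predicts a flat 4x at every length, while the reported ratio keeps climbing, which is why "more words, more overhead" does not by itself account for the numbers.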

Now this may be expected behaviour, but since I don't know if it is, I
posted to the mailing lists to find out. If you're saying this is ok and
there's nothing to fix here, then there's nothing more to discuss and we
may consider the thread closed.
If not, ts_headline deserves a closer look.

cheers,
Wojciech Knapik
