Re: [GENERAL] Creation of tsearch2 index is very slow

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ron <rjpeace(at)earthlink(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [GENERAL] Creation of tsearch2 index is very slow
Date: 2006-01-21 13:29:13
Message-ID: Pine.GSO.4.63.0601211619541.14417@ra.sai.msu.su
Lists: pgsql-general pgsql-performance

On Sat, 21 Jan 2006, Martijn van Oosterhout wrote:

> However, IMHO, this algorithm is optimising the wrong thing. It
> shouldn't be trying to split into sets that are far apart, it should be
> trying to split into sets that minimize the number of set bits (i.e.
> distance from zero), since that's what will speed up searching.

Martijn, you're right! We want not only to split a page into very
different parts, but also to avoid increasing the number of set bits in
the resulting signatures, each of which is the union (bitwise OR) of
all signatures in its part. We need not only fast index creation
(thanks, Tom!), but also a better index. Some information is available
here:
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals
There should be a more detailed document, but I don't remember where :)
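
To make the trade-off concrete, here is a minimal sketch, not the actual
tsearch2 picksplit code. It models signatures as plain integers used as
bitmasks, and every name in it (union_signature, split_cost, may_contain)
is hypothetical. It assumes a search descends into a subtree only when the
subtree's union signature contains every bit of the query signature.

```python
# Illustrative only: signatures are Python ints treated as 8-bit masks.
# None of these helpers exist in tsearch2; they are assumptions for the example.

def union_signature(signatures):
    """OR together all signatures that land in one half of a page split."""
    result = 0
    for sig in signatures:
        result |= sig
    return result

def set_bits(signature):
    """Count the bits set in a signature (its 'distance from zero')."""
    return bin(signature).count("1")

def split_cost(left, right):
    """Total set bits over both union signatures; lower means sparser parents."""
    return set_bits(union_signature(left)) + set_bits(union_signature(right))

def may_contain(node_signature, query_signature):
    """Assumed descent test: every query bit must be present in the node."""
    return node_signature & query_signature == query_signature

def subtrees_visited(left, right, query_signature):
    """How many of the two halves a search would have to descend into."""
    return sum(
        may_contain(union_signature(part), query_signature)
        for part in (left, right)
    )

a, b = 0b00001111, 0b00001110
c, d = 0b11110000, 0b01110000
query = 0b00000010

good_split = ([a, b], [c, d])   # similar signatures kept together
bad_split = ([a, c], [b, d])    # dissimilar signatures mixed

print(split_cost(*good_split), subtrees_visited(*good_split, query))
print(split_cost(*bad_split), subtrees_visited(*bad_split, query))
```

In this toy case the split that keeps similar signatures together has fewer
set bits in its unions, and the query has to descend into fewer subtrees.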

> That's harder though (this algorithm does approximate it sort of)
> and I haven't come up with an algorithm yet.

Don't ask how hard we thought about it :)

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
