From: | Benjamin Arai <benjamin(at)araisoft(dot)com> |
---|---|
To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Cc: | Postgresql <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: multi terabyte fulltext searching |
Date: | 2007-03-21 16:01:31 |
Message-ID: | 939DC5F2-448B-4CC9-A1F4-891329172F67@araisoft.com |
Lists: | pgsql-general |
By the way, what is the largest TSearch2 database that you know of
and how fast does it return results? Maybe my expectations are
unrealistic.
Benjamin
On Mar 21, 2007, at 8:42 AM, Oleg Bartunov wrote:
> Benjamin,
>
> as one of the authors of tsearch2, I'd like to know more about your
> setup.
> tsearch2 in 8.2 has GIN index support, which scales much better
> than the old GiST index.
>
> Oleg
>
> On Wed, 21 Mar 2007, Benjamin Arai wrote:
>
>> Hi,
>>
>> I have been struggling to get fulltext searching to work for very
>> large databases. I can fulltext index tens of gigabytes without any
>> problem, but when I start getting to hundreds of gigabytes it becomes
>> slow. My current system is a quad core with 8GB of memory. I
>> have the resources to throw more hardware at it, but realistically
>> it is not cost effective to buy a system with 128GB of memory. Are
>> there any solutions that people have come up with for indexing
>> very large text databases?
>>
>> Essentially I have several terabytes of text that I need to
>> index. Each record is about 5 paragraphs of text. I am currently
>> using TSearch2 (stemming, etc.) and getting sub-optimal
>> results. Queries take more than a second to execute. Has anybody
>> implemented such a database using multiple systems or some special
>> add-on to TSearch2 to make things faster? I want to do something
>> like partitioning the data into multiple systems and merging the
>> ranked results at some master node. Is something like this
>> possible for PostgreSQL or must it be a software solution?
>>
>> Benjamin
>>
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
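The partition-and-merge idea raised in the original question (split the data across several systems, then merge the ranked results at a master node) can be sketched roughly as follows. This is only a minimal sketch, not a TSearch2 feature: the function name `merge_ranked`, the `(rank, doc_id)` tuple shape, and the sample hits are all hypothetical, and it assumes each partition has already returned its own hits sorted by rank in descending order, with rank values that are comparable across partitions.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

# Hypothetical hit shape: (rank, document id); a higher rank is a better match.
Hit = Tuple[float, str]


def merge_ranked(shard_results: Iterable[List[Hit]], limit: int) -> List[Hit]:
    """Merge per-partition hit lists into one global top-`limit` list.

    Assumes every input list is already sorted by rank, highest first,
    and that ranks from different partitions can be compared directly.
    """
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    # heapq.merge is lazy, so only the first `limit` hits are ever pulled.
    return list(islice(merged, limit))


if __name__ == "__main__":
    # Made-up results standing in for two partitions' answers to one query.
    shard_a = [(0.9, "a1"), (0.4, "a2")]
    shard_b = [(0.7, "b1"), (0.1, "b2")]
    for rank, doc_id in merge_ranked([shard_a, shard_b], limit=3):
        print(doc_id, rank)
```

Because each partition only needs to return its own top `limit` hits, the master node merges at most `limit` rows per partition regardless of how large the partitions are. Whether the ranks really are comparable depends on how each partition computes them, so that assumption needs checking in any real setup.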