Re: Working with huge amount of data. RESULTS!

From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: Mario Lopez <mario(at)lar3d(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Working with huge amount of data. RESULTS!
Date: 2008-02-12 14:58:01
Message-ID: 20080212145801.GA5919@depesz.com
Lists: pgsql-general

On Tue, Feb 12, 2008 at 03:45:51PM +0100, Mario Lopez wrote:
> the reversed index, which takes like 20 minutes. I guess it has to do
> with the plperl function; perhaps a C function for reversing would do
> it in less time.

sure. take a look at this:
http://www.depesz.com/index.php/2007/09/05/indexable-field-like-something-update/
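the idea in that post can be sketched roughly like this (a minimal sketch only; the table name `test`, column `field`, and function name `text_reverse` are made up for illustration, and the plperl body stands in for the faster C version the post discusses):

```sql
-- immutable wrapper so the function can be used in an index expression;
-- note "scalar" - perl's reverse() only reverses a string in scalar context
CREATE OR REPLACE FUNCTION text_reverse(text) RETURNS text AS $$
    return scalar reverse($_[0]);
$$ LANGUAGE plperl IMMUTABLE STRICT;

-- text_pattern_ops lets LIKE 'prefix%' use the index under non-C locales
CREATE INDEX test_field_rev_idx
    ON test (text_reverse(field) text_pattern_ops);

-- a suffix search LIKE '%abc' becomes an indexable prefix search:
SELECT * FROM test WHERE text_reverse(field) LIKE 'cba%';
```

the trick is that reversing both the column and the pattern turns a trailing-wildcard-on-the-left query into a plain prefix scan.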

> The problem is still with the LIKE '%keyword%', my problem is that I am
> not searching for Words in a dictionary fashion, suppose my "data" is
> random garbage, that it has common consecutive bytes. How could I
> generate a dictionary from this random garbage to make it easier for
> indexing?

if you are talking about really general "random bytes", and really
general "random substrings" - you're basically out of luck.

but think about acceptable limitations.
can you limit the minimum length of the searched substring?
if so, check pg_trgm in contrib.
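a minimal sketch of what pg_trgm gives you (table and column names `data` and `payload` are made up; on 8.x you load contrib/pg_trgm.sql into the database first, since CREATE EXTENSION only exists in later releases):

```sql
-- trigram index over the searched column
CREATE INDEX data_payload_trgm_idx
    ON data USING gist (payload gist_trgm_ops);

-- index-assisted similarity search with the % operator;
-- matching is based on shared 3-character substrings, which is
-- why a minimum search-term length of 3 matters
SELECT * FROM data WHERE payload % 'keyword';
```

since trigrams are just all 3-byte windows of the value, this works even when the data isn't made of dictionary words - but it cannot help with substrings shorter than 3 characters.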

as for generating a dictionary - show me the data and perhaps i can
think of something.

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA. here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)
