Re: Working with huge amount of data. RESULTS!

From: Mario Lopez <mario(at)lar3d(dot)com>
To: depesz(at)depesz(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Working with huge amount of data. RESULTS!
Date: 2008-02-12 14:45:51
Message-ID: 47B1B11F.4010606@lar3d.com
Thread:
Lists: pgsql-general

Hi!,

I optimized the LIKE 'keyword%' and LIKE '%keyword' queries, with the following
results:

# time /Library/PostgreSQL8/bin/psql -U postgres -d testdb -c "select *
from table1 where varchar_reverse(data) like varchar_reverse('%keyword');"
real 0m0.055s
user 0m0.011s
sys 0m0.006s

# time /Library/PostgreSQL8/bin/psql -U postgres -d testdb -c "select *
from table1 where data like 'keyword%';"
real 0m0.026s
user 0m0.012s
sys 0m0.006s

It works flawlessly, as you can see from the timings; take into consideration
that "table1" has 100 million records. The only problem is generating
the reversed index, which takes about 20 minutes. I guess that has to do
with the PL/Perl function; perhaps a C function for reversing would build
it in less time.
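For reference, the setup behind those timings might look roughly like this (a sketch: the index name is mine, and a loop-based PL/pgSQL reverse stands in for the PL/Perl one; later PostgreSQL releases, 9.1+, ship a built-in reverse()). Reversing both the column and the pattern turns '%keyword' into the prefix search 'drowyek%', which a btree index can serve:

```sql
-- Loop-based reverse; IMMUTABLE so it can back a functional index.
CREATE OR REPLACE FUNCTION varchar_reverse(text) RETURNS text AS $$
DECLARE
    result text := '';
BEGIN
    FOR i IN REVERSE length($1)..1 LOOP
        result := result || substr($1, i, 1);
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

-- varchar_pattern_ops lets LIKE 'prefix%' use the index under non-C locales.
CREATE INDEX table1_data_rev_idx
    ON table1 (varchar_reverse(data) varchar_pattern_ops);
```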

The problem remains with LIKE '%keyword%'. My issue is that I am not
searching for words in a dictionary fashion: suppose my "data" is
random garbage that merely has some common consecutive bytes. How could I
generate a dictionary from this random garbage to make it easier to
index?
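One standard way to index arbitrary '%keyword%' searches without a dictionary is n-grams: split every value into overlapping fixed-length substrings, index those, and answer a query by intersecting the rows that contain each n-gram of the keyword, then recheck the candidates. This works on random bytes. PostgreSQL's contrib/pg_trgm module does exactly this with trigrams (direct LIKE support over a GIN index arrived later, in 9.1); a sketch, assuming a 9.1+ server:

```sql
-- pg_trgm decomposes each value into 3-character grams and indexes them all,
-- so a leading wildcard is no longer a problem.
CREATE EXTENSION pg_trgm;  -- contrib module; CREATE EXTENSION is 9.1+ syntax

CREATE INDEX table1_data_trgm_idx
    ON table1 USING gin (data gin_trgm_ops);

-- The planner turns the LIKE into trigram lookups plus a recheck:
SELECT * FROM table1 WHERE data LIKE '%keyword%';
```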

Thanks

> On Mon, Feb 11, 2008 at 04:37:24PM +0100, Mario Lopez wrote:
>
>> SELECT * FROM names WHERE name LIKE 'keyword%'
>> Or
>> SELECT * FROM names WHERE name LIKE '%keyword%'
>>
>
> check this:
> http://www.depesz.com/index.php/2007/07/30/indexable-field-like-something/
> and this:
> http://www.depesz.com/index.php/2007/09/15/speeding-up-like-xxx/
>
> depesz
>
>
