From: | "Mason Hale" <masonhale(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | bloom filter indexes? |
Date: | 2008-06-03 14:43:25 |
Message-ID: | 8bca3aa10806030743h24a18d4dq334cf91f66aa635d@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I've been working on partitioning a rather large dataset into multiple
tables. One limitation I've run into the lack of cross-partition-table
unique indexes. In my case I need to guarantee the uniqueness of a
two-column pair across all partitions -- and this value is not used to
partition the tables. The table is partitioned based on a insert date
timestamp.
To check the uniqueness of this value I've added an insert/update
trigger to search for matches in the other partitions. This trigger is
adding significant overhead to inserts and updates.
This sort of 'membership test' where I need only need to know if the
key exists in the table is a perfect match for bloom filter. (see:
http://en.wikipedia.org/wiki/Bloom_filter)
The Bloom filter can give false positives so using it alone won't
provide the uniqueness check I need, but it should greatly speed up
this process.
Searching around for "postgresql bloom filter" I found this message
from 2005 along the same lines:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg01475.php
This thread indicates bloom filters are used in the intarray contrib
module and the tsearch2 (and I assume the built-in 8.3 full-text
search features).
I also found this assignment for CS course at the University of
Toronto, when entails using bloom filters to speed up large joins:
http://queens.db.toronto.edu/~koudas/courses/cscd43/hw2.pdf
So, my question: are there any general-purpose bloom filter
implementations for postgresql? I'm particularly interested
implementations that would be useful for partitioned tables. Is anyone
working on something like this?
thanks,
- Mason
From | Date | Subject | |
---|---|---|---|
Next Message | PJ | 2008-06-03 16:14:43 | E_PARSE error ? |
Previous Message | Stephan Szabo | 2008-06-03 14:39:59 | Re: join ... using ... and - is this expected behaviour? |