| From: | "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com> | 
|---|---|
| To: | Ben <bench(at)silentmedia(dot)com> | 
| Cc: | Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgresql-General mailing list <pgsql-general(at)postgresql(dot)org> | 
| Subject: | Re: scoring differences between bitmasks | 
| Date: | 2005-10-02 17:59:12 | 
| Message-ID: | 43401FF0.9020907@blackducksoftware.com | 
| Lists: | pgsql-general | 
Hi,
Try breaking the vector into 4 bigint columns and building a multi-column
index, with index columns going from the most evenly distributed to the
least.  Depending on the distribution of your data, you may only need 2
or 3 columns in the index.  If you can cluster the table in that order,
it should be really fast.  (This structure is a tabular form of a linked
trie.)
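
To illustrate the split described above, here is a minimal sketch in Python, assuming the 256-bit vector is held as a non-negative integer. The names `split_vector` and `join_vector` are hypothetical helpers, not part of PostgreSQL or any library. Because `bigint` is a signed 64-bit type, each 64-bit word is mapped into the signed range before it would be stored.

```python
MASK64 = (1 << 64) - 1  # low 64 bits


def to_signed64(word):
    """Map an unsigned 64-bit word into the signed range of a bigint."""
    return word - (1 << 64) if word >= (1 << 63) else word


def split_vector(vec):
    """Split a 256-bit integer into four signed 64-bit values,
    most significant word first (one value per bigint column)."""
    if not 0 <= vec < (1 << 256):
        raise ValueError("vector must fit in 256 bits")
    words = [(vec >> shift) & MASK64 for shift in (192, 128, 64, 0)]
    return tuple(to_signed64(w) for w in words)


def join_vector(cols):
    """Rebuild the 256-bit integer from the four column values."""
    vec = 0
    for col in cols:
        # Masking turns a negative column value back into its unsigned word.
        vec = (vec << 64) | (col & MASK64)
    return vec
```

The four values returned by `split_vector` would go into the four bigint columns. Which column leads the index is a separate choice that depends on how evenly each word is distributed in the actual data.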
-- todd
Ben wrote:
> Yes, that's the straightforward way to do it. But given that my vectors
> are 256 bits in length, and that I'm going to eventually have about 4
> million of them to search through, I was hoping greater minds than mine
> had figured out how to do it faster, or how to compute some kind of
> indexing....... somehow.