From: | "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com> |
---|---|
To: | Ben <bench(at)silentmedia(dot)com> |
Cc: | Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgresql-General mailing list <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: scoring differences between bitmasks |
Date: | 2005-10-02 17:59:12 |
Message-ID: | 43401FF0.9020907@blackducksoftware.com |
Lists: | pgsql-general |
Hi,
Try breaking the vector into 4 bigint columns and building a multi-column
index, with index columns going from the most evenly distributed to the
least. Depending on the distribution of your data, you may only need 2
or 3 columns in the index. If you can cluster the table in that order,
it should be really fast. (This structure is a tabular form of a linked
trie.)
-- todd
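
A minimal sketch of the column split suggested above, in Python. It assumes each 256-bit vector is held as a non-negative integer and that every 64-bit word is stored in a signed PostgreSQL bigint column. The table, column, and index names (vectors, w0..w3, vectors_words_idx) are hypothetical, and the column order in the index is a placeholder: per the advice above, it should run from the most evenly distributed word to the least for the actual data.

```python
# Hypothetical schema: four bigint columns plus one multi-column index.
# Reorder the indexed columns to match the distribution of the real data.
DDL = """
CREATE TABLE vectors (
    id serial PRIMARY KEY,
    w0 bigint NOT NULL,
    w1 bigint NOT NULL,
    w2 bigint NOT NULL,
    w3 bigint NOT NULL
);
CREATE INDEX vectors_words_idx ON vectors (w0, w1, w2, w3);
"""

MASK64 = (1 << 64) - 1


def split_vector(vec):
    """Split a 256-bit unsigned integer into four signed 64-bit words.

    Words are returned most significant first. Each word is mapped into
    the signed range [-2**63, 2**63 - 1] so that it fits a bigint column.
    """
    if not 0 <= vec < (1 << 256):
        raise ValueError("vector must fit in 256 bits")
    words = []
    for shift in (192, 128, 64, 0):
        word = (vec >> shift) & MASK64
        if word >= (1 << 63):
            word -= 1 << 64  # reinterpret the top bit as the sign bit
        words.append(word)
    return tuple(words)


def join_words(words):
    """Inverse of split_vector: rebuild the 256-bit unsigned integer."""
    vec = 0
    for word in words:
        vec = (vec << 64) | (word & MASK64)
    return vec
```

The sign reinterpretation preserves every bit pattern, so equality lookups on the four columns behave exactly like equality on the original vector. Ordering within a column follows signed order rather than unsigned order, which only matters for range scans.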
Ben wrote:
> Yes, that's the straightforward way to do it. But given that my vectors
> are 256 bits in length, and that I'm going to eventually have about 4
> million of them to search through, I was hoping greater minds than mine
> had figured out how to do it faster, or how to compute some kind of
> indexing....... somehow.