Re: scoring differences between bitmasks

From:	"Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>
To:	Ben <bench(at)silentmedia(dot)com>
Cc:	Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgresql-General mailing list <pgsql-general(at)postgresql(dot)org>
Subject:	Re: scoring differences between bitmasks
Date:	2005-10-02 17:59:12
Message-ID:	43401FF0.9020907@blackducksoftware.com
Lists:	pgsql-general

Hi,

Try breaking the vector into 4 bigint columns and building a multi-column
index, with index columns going from the most evenly distributed to the
least. Depending on the distribution of your data, you may only need 2
or 3 columns in the index. If you can cluster the table in that order,
it should be really fast. (This structure is a tabular form of a linked
trie.)
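
To make the splitting step concrete, here is a minimal sketch in Python of how a 256-bit vector might be cut into four values that fit PostgreSQL's signed bigint. The function names (split_to_bigints, join_from_bigints) are made up for illustration, and the chunk order shown (most significant 64 bits first) is only an assumption; in practice the index column order would follow the data distribution as described above, not the bit position.

```python
from typing import Sequence, Tuple

MASK64 = (1 << 64) - 1  # low 64 bits


def split_to_bigints(vector: int) -> Tuple[int, ...]:
    """Split a 256-bit unsigned integer into four signed 64-bit chunks.

    Chunks are returned most significant first (an arbitrary choice here).
    """
    if not 0 <= vector < (1 << 256):
        raise ValueError("vector must fit in 256 bits")
    chunks = []
    for shift in (192, 128, 64, 0):
        chunk = (vector >> shift) & MASK64
        # bigint is signed, so reinterpret the top bit as two's complement.
        if chunk >= (1 << 63):
            chunk -= 1 << 64
        chunks.append(chunk)
    return tuple(chunks)


def join_from_bigints(chunks: Sequence[int]) -> int:
    """Inverse of split_to_bigints: rebuild the 256-bit unsigned integer."""
    vector = 0
    for chunk in chunks:
        vector = (vector << 64) | (chunk & MASK64)
    return vector
```

Each of the four values would then go into its own bigint column, and the multi-column index would be built over whichever two to four of those columns discriminate best.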

-- todd

Ben wrote:
> Yes, that's the straightforward way to do it. But given that my vectors
> are 256 bits in length, and that I'm going to eventually have about 4
> million of them to search through, I was hoping greater minds than mine
> had figured out how to do it faster, or how to compute some kind of
> indexing... somehow.

In response to

Re: scoring differences between bitmasks at 2005-10-02 15:28:53 from Ben

Responses

Re: scoring differences between bitmasks at 2005-10-02 18:02:19 from Ben
