Quick Links

Re: scoring differences between bitmasks

From:	"Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>
To:	Ben <bench(at)silentmedia(dot)com>
Cc:	Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgresql-General mailing list <pgsql-general(at)postgresql(dot)org>
Subject:	Re: scoring differences between bitmasks
Date:	2005-10-02 18:44:37
Message-ID:	43402A95.5090701@blackducksoftware.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Hi,

It may be that I don't understand your problem. :)

Are you searching the table for the closest vector? If so, is
"closeness" defined only as the number of bits that are different?
Or, do you need to know which bits as well?

-- todd

Ben wrote:
> Hrm, I don't understand. Can you give me an example with some
> reasonably sized vectors?
>
> On Oct 2, 2005, at 10:59 AM, Todd A. Cook wrote:
>
>> Hi,
>>
>> Try breaking the vector into 4 bigint columns and building a multi-
>> column
>> index, with index columns going from the most evenly distributed to the
>> least. Depending on the distribution of your data, you may only need 2
>> or 3 columns in the index. If you can cluster the table in that order,
>> it should be really fast. (This structure is a tabular form of a linked
>> trie.)
>>
>> -- todd
>>
>>
>> Ben wrote:
>>
>>> Yes, that's the straightforward way to do it. But given that my
>>> vectors are 256 bits in length, and that I'm going to eventually
>>> have about 4 million of them to search through, I was hoping
>>> greater minds than mine had figured out how to do it faster, or how
>>> compute some kind of indexing....... somehow.

In response to

Re: scoring differences between bitmasks at 2005-10-02 18:02:19 from Ben

Responses

Re: scoring differences between bitmasks at 2005-10-02 18:49:03 from Ben

Browse pgsql-general by date

	From	Date	Subject
Next Message	Ben	2005-10-02 18:49:03	Re: scoring differences between bitmasks
Previous Message	David Fetter	2005-10-02 18:36:59	Re: Multiple database queries