Re: scoring differences between bitmasks

From: "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>
To: Ben <bench(at)silentmedia(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgresql-General mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: scoring differences between bitmasks
Date: 2005-10-02 17:59:12
Message-ID: 43401FF0.9020907@blackducksoftware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

Try breaking the vector into 4 bigint columns and building a multi-column
index, with index columns going from the most evenly distributed to the
least. Depending on the distribution of your data, you may only need 2
or 3 columns in the index. If you can cluster the table in that order,
it should be really fast. (This structure is a tabular form of a linked
trie.)

-- todd

Ben wrote:
> Yes, that's the straightforward way to do it. But given that my vectors
> are 256 bits in length, and that I'm going to eventually have about 4
> million of them to search through, I was hoping greater minds than mine
> had figured out how to do it faster, or how compute some kind of
> indexing....... somehow.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marc G. Fournier 2005-10-02 18:00:52 Re: Portable PostgreSQL
Previous Message Nirmalya Lahiri 2005-10-02 17:34:19 Re: Broken pipe