indexing array columns

From: Rajarshi Guha <rguha(at)indiana(dot)edu>
To: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: indexing array columns
Date: 2007-04-13 22:09:50
Message-ID: 1176502190.6880.8.camel@panda
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi, I have a table of about 10M rows. It has two columns A and B, where
A is a text field and B is a real[12] field.

Now when this table is queried it is usually of the form:

select A from theTable where sim(B, C) > 0.8

Now C will be a 12-element array specified by the user and the value 0.8
can be arbitrary. The function, sim(), is essentially a similarity
function which simply computes the inverse of the Manhattan distance
between the query array and the rows of the array column. Right now the
above query uses a seq scan.

Furthermore, the values of the individual array elements for any given
row can vary from 0 to infinity (but for most cases will be numbers less
than 1000)

My question is: how can I index the column B so that such queries are
fast.

I realize that my table is essentially a collection of 12-dimensional
points and that I could replace my similarity function with a distance
function.

Thus my query boils down to asking 'find me rows of the table that are
within X distance of my query'

I know that the GIS community deals with 2D points, but I'm not familiar
with this area and if I understand correctly, they use Euclidean
distances, where as I need Manhattan distances.

What type of indexing, available in Postgres could be used for my
problem? Would it require me to implement my own indexing scheme?

Any pointers would be appreciated

Thanks,

-------------------------------------------------------------------
Rajarshi Guha <rguha(at)indiana(dot)edu>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Every little picofarad has a nanohenry all its own.
-- Don Vonada

Responses

Browse pgsql-general by date

  From Date Subject
Next Message sixtus 2007-04-13 22:29:08 Trigger on transaction?
Previous Message Andrej Ricnik-Bay 2007-04-13 21:46:04 Re: question