Array indexes, GIN?

From: Adam L Beberg <beberg(at)mithral(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Array indexes, GIN?
Date: 2007-03-02 03:59:04
Message-ID: 45E7A108.9010802@mithral.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I need to cross reference 2 tables. There are O(10M) A's, each has an
ordered set of 10 of the O(100K) B's associated with it. The dominant
query will be finding the A's and their count associated with a given
list of ~1k B's i.e. if 2 of the listed B's are in A's set of 10, it's
(A,2), and we should get back ~100K rows. The good news is we only need
to run this brutal query every couple minutes, but the row updates will
flow fast.

Luckily this is PostgreSQL, so the simple solution seems to be

CREATE TABLE xref( A bigint, B bigint[10] ); -- A is primary key

which cuts down the table overhead. O(10M) rows w/array.

On the surface, looks like a job for GIN, but GIN seems undocumented,
specifically mentions it doesn't support the deletes we'll have many of
since it's designed for word searching apparently, the performance
implications are undocumented. I searched, I read, and even IRC'd, and
it seems like GIN is just not used much.

Is GIN right? Will this work at all? Will it run fast enough to function?

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Leangen 2007-03-02 04:00:30 Improving query performance
Previous Message Alex Deucher 2007-03-02 03:16:37 Re: strange performance regression between 7.4 and 8.1