Re: Better index stategy for many fields with few values

From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: Oscar Picasso <oscgoogle(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Better index stategy for many fields with few values
Date: 2006-04-12 12:59:32
Message-ID: 443CF9B4.2080009@logix-tt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi, Oscar,

Oscar Picasso wrote:

> [ all the 20 paramXX columns are used in the query}

> How can I optimize this kind of query?

PostgreSQL 8.1 has so-called bitmap index scans, which can combine
several index scans before actually accessing the data.

So I think it's best to create an index on each of the paramXX columns,
and see with EXPLAIN ANALYZE what it is doing.

> I was thinking about using a multicolumns index, but I have read that
> we should limit multicolumns indice to at most 2 or 3 columns.

Yes, that's true, the index overhead gets too high.

> If that's true then 22 columns for a multicolumn incdex seems way too
> much. Or maybe it is workable as every column uses only a very limited
> set of values?

Yes, I think that a 22 column index is way too much, especially with the
new bitmap index scans available.

> I was also thinking about about using a functional index.

If there's a logical relation between those values that they can easily
combined, that may be a good alternative.

I just had another weird idea:

As your paramXX values can have only 10 parameters, it also might be
feasible to use a bunch of 10 conditional indices, like:

CREATE INDEX foo1 ON table (param1, param2 WHERE param0='1st value';
CREATE INDEX foo2 ON table (param1, param2 WHERE param0='2nd value';
CREATE INDEX foo3 ON table (param1, param2 WHERE param0='3rd value';
[...]

This way, you don't have the index bloat of a 3-column index, but 10
2-column indices that cover 1/10th of the table each.

For 22 columns, you'd need a bunch of seven such indices plus a
single-column one, or can use some 3+1 and some 2+1 column index.

I'd like to see the query plans from explain analyze.

Btw, I expect query planning time to get rather significant for so much
columns, so gequo tuning, tuning work_mem (for the bitmap scans) and
prepared statements will pay off.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Harry Hehl 2006-04-12 13:04:28 Re: Sequencial scan instead of using index
Previous Message PFC 2006-04-12 07:22:52 Re: FOREIGN KEYS vs PERFORMANCE