Re: Multiple indexes or multi-column index?

From: "Marc Mitchell" <marcm(at)eisolution(dot)com>
To: "Jeff Boes" <jboes(at)nexcerpt(dot)com>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Multiple indexes or multi-column index?
Date: 2002-08-13 21:00:17
Message-ID: 016001c2430c$6dba7540$6e01050a@eisolution.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

As has already been said, in order to use the index, the left-most index
columns must be supplied as part of your qualification. But, beyond the
basics of when an index can or cannot be used, I would say you should
consider three additional issues: the distribution of values for any of
these columns, the effects of joining from index to base table, and the
volatility of the columns being indexed.

For example, if foo.a has potentially 100,000 different values then there
would seem to be little value in indexing the remaining three columns since
with a value for foo.a alone, you'll quickly narrow your request to a
handful of rows. Extra columns just make the index entries themselves
wider. The one benefit is that, by including columns within an index, even
if not used as part of the index access strategy, can eliminate the need to
join back to the base table itself if all the columns requested are within
the index itself. HOWEVER: while I know this is the case with other RDBMS
products, I can't verify this within Postgres. If I do a

"SELECT foo.index_column FROM foo WHERE foo.indexed_column LIKE 'ABC*';"

it runs the exact EXPLAIN results (aside from result with) as:

"SELECT foo.index_column FROM foo WHERE foo.indexed_column LIKE 'ABC*';"

so I'm not sure if the join from index to base table is being avoided or
not in the former case.

Furthermore, balance the ordering of columns between what is required for
index applicability based on the queries you run AND the distribution of
values for the columns you are indexing. If, for example, all queries will
supply at least foo.a and foo.b, whichever of these two columns has the
fewer rows per given value should be listed first.

You should also consider the volatility of the columns you are including
within an index. While the effect on inserts is basically the same (aside
from the width issue), adding extra columns to an index that are updated a
lot is not as desirable as extra columns that don't change.

Hope this is useful.

Marc Mitchell - Senior Application Architect
Enterprise Information Solutions, Inc.
Downers Grove, IL 60515
marcm(at)eisolution(dot)com

----- Original Message -----
From: "Jeff Boes" <jboes(at)nexcerpt(dot)com>
To: <pgsql-admin(at)postgresql(dot)org>
Sent: Tuesday, August 13, 2002 3:14 PM
Subject: [ADMIN] Multiple indexes or multi-column index?

> Given a large (100,000+ rows) table such as:
>
> CREATE TABLE foo (
> a INTEGER,
> b INTEGER,
> c TIMESTAMP,
> d TEXT
> /* possibly several more columns in here */
> );
>
> What are the pros and cons of these two different approaches to indexes?
>
> CREATE huge_index ON foo (a,b,c,d);
>
> vs.
>
> CREATE a_index ON foo (a);
> CREATE b_index ON foo (b);
> CREATE c_index ON foo (c);
> CREATE d_index ON foo (d);
>
> --
> Jeff Boes vox 269.226.9550 ext 24
> Database Engineer fax 269.349.9076
> Nexcerpt, Inc. http://www.nexcerpt.com
> ...Nexcerpt... Extend your Expertise
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2002-08-13 21:02:47 Re: performance tuning: shared_buffers, sort_mem; swap
Previous Message Jeff Boes 2002-08-13 20:14:31 Multiple indexes or multi-column index?