Re: Indexes on Large Tables

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Donny Drummonds <donny(at)cypresstg(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Indexes on Large Tables
Date: 2003-02-07 05:13:15
Message-ID: Pine.NEB.4.51.0302071408260.369@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Mon, 3 Feb 2003, Donny Drummonds wrote:

> If I do not index the column from the where clause the query returns
> the 150,000 rows in 4 and a half minutes. If in do index the column
> from the where clause using a btree the 150,000 rows return in 11 and
> a half minutes.
>
> Any insight would be greatly appreciated.

Well, I'm not sure if this is the insight you're looking for, but....

The reason it takes longer if you use an index is that you change
from sequential I/O (which is relatively fast) to random I/O (which
is relatively slow). With the table scan (reading the entire table in
whatever order it's in on the disk) you're reading several times as much
data, but you're not doing head seeks all over the place to move the
head to the place where the next bit of data to be read is.

Obviously, in this case, even though an index was available, the planner
was wrong to chose to use it rather than just read the entire table.
That is, as someone else mentioned, likely due to bad statistics: the
planner thought you were going to select a very small part of the table,
rather than ten percent of it (which is a pretty large fraction, for
these purposes). Try doing an ANALYZE.

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2003-02-07 06:34:16 Re: Problems upgrading from 7.1.3
Previous Message Steve Crawford 2003-02-06 21:56:49 Re: