Re: index scan on =, but not < ?

From: John Arbash Meinel <john(at)arbash-meinel(dot)com>
To: Rick Schumeyer <rschumeyer(at)ieee(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: index scan on =, but not < ?
Date: 2005-03-08 19:01:20
Message-ID: 422DF680.1000409@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Rick Schumeyer wrote:

> I have two index questions. The first is about an issue that has been
> recently discussed,
>
> and I just wanted to be sure of my understanding. Functions like
> count(), max(), etc. will
>
> use sequential scans instead of index scans because the index doesn’t
> know which rows
>
> are actually visible…is this correct?
>
Actually, index scans are chosen whenever the cost is expected to be
cheaper than a sequential scan. This is generally about < 10% of the
total number of rows.

> Second:
>
> I created an index in a table with over 10 million rows.
>
> The index is on field x, which is a double.
>
> The following command, as I expected, results in an index scan:
>
> =# explain select * from data where x = 0;
>
> QUERY PLAN
>
> -------------------------------------------------------------------------
>
> Index Scan using data_x_ix on data (cost=0.00..78.25 rows=19 width=34)
>
> Index Cond: (x = 0::double precision)
>
> (2 rows)
>
Since you have 10m rows, when it expects to get only 19 rows, it is much
faster to use an index.

> But this command, in which the only difference if > instead of =, is a
> sequential scan.
>
> =# explain select * from data where x > 0;
>
> QUERY PLAN
>
> ------------------------------------------------------------------
>
> Seq Scan on data (cost=0.00..1722605.20 rows=62350411 width=34)
>
> Filter: (x > 0::double precision)
>
> (2 rows)
>
Here, pg expects to find 62M rows (you must have significantly more than
10M rows). In this case a sequential scan is much faster than an indexed
one, so that's what pg does.

> Why is this?
>
> (This is with pg 8.0.1 on a PC running FC3 with 1GB ram…if it matters)
>
If you think there is truly a performance problem, try attaching the
results of "explain analyze" in which we might be able to tell you that
your statistics inaccurate (run vacuum analyze if you haven't).

John
=:->

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Rick Schumeyer 2005-03-08 19:02:50 Re: index scan on =, but not < ?
Previous Message Thomas F.O'Connell 2005-03-08 18:52:41 Re: index scan on =, but not < ?