Re: Optimizing >= and <= for numbers and dates

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Dimitri Nagiev <dnagiev(at)gmx(dot)de>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Optimizing >= and <= for numbers and dates
Date: 2003-10-01 19:03:21
Message-ID: Pine.LNX.4.33.0310011300310.22235-100000@css120.ihs.com
Lists: pgsql-performance

On Wed, 1 Oct 2003, Dimitri Nagiev wrote:

> here goes the EXPLAIN ANALYZE output:
>
>
> template1=# VACUUM analyze mytable;
> VACUUM
> template1=# explain analyze select * from mytable where
> mydate>='2003-09-01';
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------
> Seq Scan on mytable (cost=0.00..2209.11 rows=22274 width=562) (actual
> time=0.06..267.30 rows=22677 loops=1)
> Filter: (mydate >= '2003-09-01'::date)
> Total runtime: 307.71 msec
> (3 rows)

How many rows are there in this table? If the total is only two or three
times the number of rows returned (22677), then a seq scan is preferable.
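
To check the total, you can count the table directly, or ask pg_class for
the planner's estimate (kept current by VACUUM/ANALYZE). Table name
`mytable` is taken from your example:

```sql
-- exact count (scans the whole table)
SELECT count(*) FROM mytable;

-- planner's row estimate, refreshed by VACUUM ANALYZE
SELECT reltuples FROM pg_class WHERE relname = 'mytable';
```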

The way to tune your random_page_cost is to keep making your range more
selective until you get an index scan, then compare the speed of the two
queries that sit on either side of that switchover point. For example, if
a query that returns 1000 rows uses an index scan and takes 100 msec,
while one that returns 1050 rows uses a seq scan and takes 200 msec, then
you probably want to lower your random_page_cost.
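
A sketch of that procedure, assuming the mytable/mydate setup from your
example and an index on mydate (the dates and cost value here are just
illustrative):

```sql
-- tighten the range until the plan flips to an index scan
EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-25';

-- lower random_page_cost for this session (default is 4) and re-test
-- the query just on the seq-scan side of the crossover
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-20';
```

SET only affects the current session, so you can experiment freely and
put the winning value in postgresql.conf once you've found it.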

Ideally, as the query returns more and more rows, the switch to seq scan
should happen right at the point where the seq scan takes about the same
amount of time as the index scan would, maybe just a little more.
