Re: Index usage on date feild , differences in '>' and '>='

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
Cc: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Index usage on date feild , differences in '>' and '>='
Date: 2002-09-19 14:28:01
Message-ID: 20020919072108.S27151-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Thu, 19 Sep 2002, Rajesh Kumar Mallah. wrote:

> Thanks very much for the response.
> set enable_seqscan=off; Definitely helps.
> and for wide date ranges it usees indexes.
>
>
> But with default value of enable_sequence changing date range seems to have effect.
> can you explain me a bit more or point me to right documents for understanding
> the languae of "EXPLAIN".

> ====================================================================
> EXPLAIN
> tradein_clients=> explain select list_id from eyp_rfi a where generated between '2002-09-11' and '2002-09-19' ;
> NOTICE: QUERY PLAN:
>
> Seq Scan on eyp_rfi a (cost=0.00..17923.81 rows=12924 width=4)
>
> EXPLAIN
> tradein_clients=> explain select list_id from eyp_rfi a where generated between '2002-09-12' and '2002-09-19' ;
> NOTICE: QUERY PLAN:
>
> Index Scan using eyp_rfi_date on eyp_rfi a (cost=0.00..17369.05 rows=12220 width=4)
>
> EXPLAIN

Note the cost and row estimates for the two queries and for the first
query with and without enable_seqscan=off; My guess is that if you
do an explain with it off, you'll get a cost number >17923.81 which
is why it's picking the seq scan, becaust it's guessing that it'll
be faster.

The row count seems off by a factor of 2 from the numbers below, have
you analyzed recently and how many rows are in the table as a whole?
Also, I'm guessing that your data is probably well ordered in relation to
those dates as opposed to truly random which may throw off the estimates
as well, what version are you using?

set enable_seqscan=off is kinda dangerous since sometimes the seq scan is
actually faster.

> the distribution of values are as follows:
> > select generated ,count(generated) from eyp_rfi a where generated between '2002-09-10' and '2002-09-19' group by generated;
>
>
> generated | count
> ------------+-------
> 2002-09-10 | 442
> 2002-09-11 | 1060
> 2002-09-12 | 641
> 2002-09-13 | 607
> 2002-09-14 | 1320
> 2002-09-15 | 521
> 2002-09-16 | 1474
> 2002-09-17 | 940
> 2002-09-18 | 1005
> 2002-09-19 | 178
> (10 rows)
>
> Last Question , Shud i do "enable_seqscan=off" in Application Level? I use Perl-DBI

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2002-09-19 14:31:52 Re: Stripping white-space in SELECT statments
Previous Message CN LIOU 2002-09-19 14:11:13 How To Get Bytea Data Instead Of Its Oid