Re: Yet another "Why won't PostgreSQL use my index?"

From: "Gregory Wood" <gregw(at)com-stock(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL-General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Yet another "Why won't PostgreSQL use my index?"
Date: 2002-06-20 21:01:56
Message-ID: 003401c2189d$b79543a0$7889ffcc@comstock.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> > Of course if PostgreSQL were estimating the number of rows correctly,
that
> > would be less of a problem. Seems that our data is throwing off the
> > statistics... we have some values that appear tens of thousands of times
and
> > others that appear only a few times, with a few values (such as the
example
> > I sent) in between. Perhaps it's time to look at TABLE SET STATISTICS...
>
> I believe that pushing the SET STATISTICS target up to 50 or so would
> solve the problem nicely, at the cost of making ANALYZE run longer.

Better that ANALYZE take a few seconds longer than my queries <grin>.

> However, it also bothered me that your actual runtime ratio was nearly
> 500:1 when the rows estimation was off "only" 36:1. There's still an
> estimation error of more than a factor of 10 in there, and that can't be
> explained by arguing about the appropriate value of random_page_cost.
> (random_page_cost less than one is certainly nonsensical.)
>
> I'm wondering whether the indexscan case was benefiting from pages
> having been read into memory by the preceding seqscan. If you run the
> seqscan plan twice in a row, does the runtime stay about the same?

Appears that way:

cns=# explain analyze select * from re_site_listings_index where
idx_siteid=387;
NOTICE: QUERY PLAN:

Seq Scan on re_site_listings_index (cost=0.00..42110.74 rows=13828
width=302) (actual time=2095.26..2095.26 rows=0 loops=1)
Total runtime: 2095.43 msec

EXPLAIN
cns=# explain analyze select * from re_site_listings_index where
idx_siteid=387;
NOTICE: QUERY PLAN:

Seq Scan on re_site_listings_index (cost=0.00..42110.74 rows=13828
width=302) (actual time=2095.12..2095.12 rows=0 loops=1)
Total runtime: 2095.24 msec

EXPLAIN
cns=# explain analyze select * from re_site_listings_index where
idx_siteid=387;
NOTICE: QUERY PLAN:

Seq Scan on re_site_listings_index (cost=0.00..42110.74 rows=13828
width=302) (actual time=2082.50..2082.50 rows=0 loops=1)
Total runtime: 2082.62 msec

EXPLAIN
cns=# explain analyze select * from re_site_listings_index where
idx_siteid=387;
NOTICE: QUERY PLAN:

Seq Scan on re_site_listings_index (cost=0.00..42110.74 rows=13828
width=302) (actual time=2057.49..2057.49 rows=0 loops=1)
Total runtime: 2057.60 msec

Greg

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2002-06-20 21:06:09 Re: Yet another "Why won't PostgreSQL use my index?"
Previous Message Joe Conway 2002-06-20 20:52:25 Re: selecting all records where a column is null