Re: Yet another "Why won't PostgreSQL use my index?"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gregory Wood" <gregw(at)com-stock(dot)com>
Cc: "Manfred Koizar" <mkoi-pg(at)aon(dot)at>, "PostgreSQL-General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Yet another "Why won't PostgreSQL use my index?"
Date: 2002-06-20 20:44:59
Message-ID: 14146.1024605899@sss.pgh.pa.us
Lists: pgsql-general

"Gregory Wood" <gregw(at)com-stock(dot)com> writes:
> Of course if PostgreSQL were estimating the number of rows correctly, that
> would be less of a problem. Seems that our data is throwing off the
> statistics... we have some values that appear tens of thousands of times and
> others that appear only a few times, with a few values (such as the example
> I sent) in between. Perhaps it's time to look at TABLE SET STATISTICS...

I believe that pushing the SET STATISTICS target up to 50 or so would
solve the problem nicely, at the cost of making ANALYZE run longer.
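
For reference, the per-column form looks like this (the table and column
names here are only placeholders for whatever your problem query filters
on):

    ALTER TABLE mytable ALTER COLUMN mycol SET STATISTICS 50;  -- raise the stats target for that column
    ANALYZE mytable;                                           -- recollect stats with the larger sample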

However, it also bothered me that your actual runtime ratio was nearly
500:1 when the row estimate was off "only" 36:1. That still leaves an
estimation error of more than a factor of 10 (roughly 500/36, or about
14), and that can't be explained by arguing about the appropriate value
of random_page_cost. (A random_page_cost of less than one is certainly
nonsensical.)

I'm wondering whether the indexscan case was benefiting from pages
having been read into memory by the preceding seqscan. If you run the
seqscan plan twice in a row, does the runtime stay about the same?
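
A quick way to check (the table, column, and constant are again just
placeholders for your actual query; EXPLAIN ANALYZE needs 7.2 or later):

    SET enable_indexscan = off;   -- make sure the planner sticks with the seqscan plan
    EXPLAIN ANALYZE SELECT * FROM mytable WHERE mycol = 42;  -- first run, cold cache
    EXPLAIN ANALYZE SELECT * FROM mytable WHERE mycol = 42;  -- second run; a big speedup here points at caching
    SET enable_indexscan = on;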

regards, tom lane
