Quick Links

Re: Index Scans become Seq Scans after VACUUM ANALYSE

From:	mlw <markw(at)mohawksoft(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Lockhart <thomas(at)fourpalms(dot)org>, Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Louis-David Mitterrand <vindex(at)apartia(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date:	2002-04-17 16:35:13
Message-ID:	3CBDA441.5F490463@mohawksoft.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Tom Lane wrote:
>
> mlw <markw(at)mohawksoft(dot)com> writes:
> > On borderline conditions, wrongly using an index does not result in as bad
> > performance as wrongly not using an index,
>
> You're arguing from a completely false premise. It might be true on the
> particular cases you've looked at, but in general an indexscan-based
> plan can be many times worse than a seqscan plan.

OK, I'll grant you that, but I am talking about the space between when it is
clear that an index is useful and when it is clear that it is not. For some
reason you seem to think I am saying "always use an index," when, in fact, I am
saying more preference should be given to using an index than it currently has.

> As for "borderline conditions", how is the planner supposed to know what
> is borderline?

It need not know about borderline conditions.

>
> I cannot see any rational justification for putting a thumb on the
> scales on the side of indexscan (or any other specific plan type)
> as you've proposed. Thomas correctly points out that you'll just move
> the planner failures from one area to another.

I don't think this is true, and you yourself had said you are not too worried
about a 10 vs 8 second difference. I have seen many instances of when
PostgreSQL refuses to use an index because the data distribution is uneven.
Making it more difficult for the planer to ignore an index would solve
practically all the problems I have seen, and I bet the range of instances
where it would incorrectly use an index would not impact performance as badly
as those instances where it doesn't.

>
> If we can identify a reason why the planner tends to overestimate the
> costs of indexscan vs seqscan, by all means let's fix that. But let's
> not derive cost estimates that are the best we know how to make and
> then ignore them.

I don't think you can solve this with statistics. It is a far more complex
problem than that. There are too many variables, there is no way a standardized
summation will accurately characterize all possible tables. There must be a way
to add heuristics to the cost based analyzer.

In response to

Re: Index Scans become Seq Scans after VACUUM ANALYSE at 2002-04-17 16:10:28 from Tom Lane

Responses

Re: Index Scans become Seq Scans after VACUUM ANALYSE at 2002-04-17 17:45:44 from Tom Lane
Re: Index Scans become Seq Scans after VACUUM ANALYSE at 2002-04-17 18:08:58 from Andrew Sullivan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Hannu Krosing	2002-04-17 17:15:40	Re: Index Scans become Seq Scans after VACUUM ANALYSE
Previous Message	Tom Lane	2002-04-17 16:14:47	Re: huge runtime difference between 2 almost identical queries (was: Re: Index Scans become Seq Scans after VACUUM ANALYSE)