Re: the big picture for index-only scans

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: the big picture for index-only scans
Date: 2011-05-11 14:22:49
Message-ID: 201105111422.p4BEMnI27616@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Cdric Villemain wrote:
> 2011/5/10 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> > Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> >> The typical speed up for non-covered indexes will come when we
> >> access a very large table (not in cache) via an index scan that is
> >> smaller than a bitmapindex scan. Will we be able to gauge
> >> selectivities sufficiently accurately to be able to pinpoint that
> >> during optimization? How will we know that the table is not in
> >> cache? Or is this an optimisation in the executor for a bitmapheap
> >> scan?
> >
> > I would continue to object to using current cache contents for plan
> > choice because of plan instability and the fact that an odd initial
> > cache load could skew plans in a bad direction indefinitely. ?I do
> > agree (and have already posted) that I think the hardest part of
> > this might be developing a good cost model. ?I doubt that's an
> > insoluble problem, especially since it is something we can refine
> > over time as we gain experience with the edge cases.
>
> you will have the same possible instability in planning with the
> index(-only?) scan because we may need to access heap anyway and this
> needs is based on estimation, or I miss something ? I understood the
> idea was just to bypass the heap access *if* we can for *this*
> heap-page.
>
> In reality, I am not really scared by plan instability because of a
> possible PG/OS cache estimation. The percentages remain stable in my
> observations ... I don't know yet how it will go for vis map.
>
> And, we already have plan instability currently, which is *good* : at
> some point a seq scan is better than an bitmap heap scan. Because the
> relation size change and because ANALYZE re-estimate the distribution
> of the data. I will be very happy to issue ANALYZE CACHE as I have to
> ANALYZE temp table for large query if it allows the planner to provide
> me the best plan in some scenario...but this is another topic, sorry
> for the digression..

Good point --- we would be making plan decisions based on the visibility
map coverage. The big question is whether visibility map changes are
more dynamic than the values we already plan against, like rows in the
table, table size, and value distributions. I don't know the answer.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2011-05-11 14:38:15 Re: the big picture for index-only scans
Previous Message Greg Stark 2011-05-11 14:22:31 Re: PGC_S_DEFAULT is inadequate