Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> wrote:
> Because the query often only wants a small subset of the data, and
> whole relations are rarely fully cached, it's not enough to know
> that "some of relation X is cached", it has to know if the cached
> parts are the parts that'll be required, or at least an
> approximation of that. It sounds horrendously complicated to keep
> track of to me, and in the end it won't make query execution any
> faster, it'll just potentially help the planner pick a better
> plan. I wonder if that'd be worth the extra CPU time spent
> managing the cache and cache content stats, and using those cache
> stats when planning? It'd be an interesting experiment, but the
> outcome is hardly obvious.
I agree with that, but I think there's an even more insidious issue
here. Biasing plans heavily toward using what is already in cache
could have a destabilizing effect on performance. Let's say that
some query or maintenance skews the cache toward some plan which is
much slower when cached than another plan would be if cached. Let's
also postulate that this query runs very frequently. It will always
settle for what's fastest *this* time, not what would make for
fastest performance if consistently used. If it never chooses the
plan which would run better if cached, the data used for that plan
may never make it into cache, and you will limp along with the
inferior plan forever.
If you set the overall level of caching you expect, the optimizer
will tend to wind up with data cached to support the optimal plans
for that level of caching for the frequently run queries.
-Kevin