>>> On Tue, Nov 13, 2007 at 2:05 PM, in message
<20071113200508.GX11563@crankycanuck.ca>, Andrew Sullivan <ajs@crankycanuck.ca>
wrote:
> On Tue, Nov 13, 2007 at 02:36:14PM -0500, Greg Smith wrote:
>
>> I see this as similar to the old optimizer hint argument, where there
>> certainly exist some edge cases where people know something the optimizer
>> doesn't which changes the optimal behavior.
>
> . . .the abuse of such hints in applications I have seen is so rampant as to
> make me doubt the utility of adding them anyway. It's true that by adding
> hints, you give a facility to a good, competent designer who has a really
> peculiar case that no general purpose system is likely to solve well. In
> practice, however, it also seems to mean that every slack-jawed fool with
> access to the manual thinks that he or she is going to "fix" the "broken"
> query plan by forcing index scans where they're useless (has a week yet gone
> by where someone doesn't post to -performance with that problem?). So I'm
> divided on whether actually providing the facility is a good idea, even
> though I can think of a handful of cases where I doubt even the smartest
> planner will get it right. (By analogy, pinning in memory, and I'm
> similarly divided.)

I can see the point of every post in this thread.

With our old commercial database product, we had performance
problems that we addressed with a "named caches" feature -- you
could declare a named cache of a particular size, tweak some of its
characteristics, and then bind objects to it. We came up with
several theories about how we could use these caches to improve on
the default LRU logic, and tested them carefully. About half of
the ideas made things worse; about half made things better. We
kept only the ones that made things better under our usage
patterns. Part of this involved using a cache sized to fully
contain all of the heavily referenced tables we bound to it.
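
For illustration, the setup looked roughly like the sketch below.
The product isn't named above, so this assumes a Sybase-ASE-style
interface; the cache, pool, database, and table names and sizes are
invented for the example.

    -- Declare a named cache of a particular size (name and size invented).
    sp_cacheconfig 'hot_tables_cache', '200M'

    -- Tweak some of its characteristics, e.g. add a 16K buffer pool
    -- for large I/O.
    sp_poolconfig 'hot_tables_cache', '50M', '16K'

    -- Bind the heavily referenced objects to it (hypothetical names).
    sp_bindcache 'hot_tables_cache', 'ourdb', 'dbo.code_lookup'
    sp_bindcache 'hot_tables_cache', 'ourdb', 'dbo.case_status'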
The proof of the benefit was that occasionally these settings got
lost through errors in machine builds or upgrades. The users would
start calling immediately, complaining about the performance; they
were happy again when we restored the named cache configurations.

The lack of such tuning knobs made me more than a little nervous
as we moved toward switching to PostgreSQL, and I'm not sure we
couldn't still put them to good use if they were available; but
PostgreSQL performs so much better overall that any gain from such
knobs would be small compared to the improvement we saw just from
making the switch.

This leaves me sympathetic to the concern in the original post,
but feeling that I should join the crowd suggesting it's best to
proceed on the assumption that such a tuning feature probably isn't
needed: go without it, and post any actual performance problems you
hit for advice. If you can kludge heavier caching for the objects
in question and show an improvement in the metric that matters for
your purposes, perhaps you can convince people it's a feature worth
having; but expect that people will want to see details and to
explore alternative solutions.
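
Purely as an illustration of that kind of kludge (the table names
and schedule here are invented, not something from this thread),
one crude approach is a cron job that touches the hot tables often
enough that their pages stay resident in the OS cache (and, for
smaller tables, in shared buffers):

    -- Run every few minutes from cron via psql; forces a scan of the
    -- hypothetical hot tables so their pages stay cached.
    SELECT count(*) FROM hot_lookup_table;
    SELECT count(*) FROM hot_reference_codes;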
-Kevin