From: | Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Greg Stark <stark(at)mit(dot)edu>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [WIP] cache estimates, cache access cost |
Date: | 2011-06-19 19:32:13 |
Message-ID: | BANLkTi=L=cOGEWDzHk90V4iMksSdzGFfFQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2011/6/19 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Sun, Jun 19, 2011 at 9:38 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
>> On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> 1. ANALYZE happens far too infrequently to believe that any data taken
>>> at ANALYZE time will still be relevant at execution time.
>>> 2. Using data gathered by ANALYZE will make plans less stable, and our
>>> users complain not infrequently about the plan instability we already
>>> have, therefore we should not add more.
>>> 3. Even if the data were accurate and did not cause plan stability, we
>>> have no evidence that using it will improve real-world performance.
>>
>> I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
>> our interface for telling postgres to gather stats and we generally
>> agree that having stats and modelling the system behaviour as
>> accurately as practical is the right direction so we need a specific
>> reason why this stat and this bit of modeling is a bad idea before we
>> dismiss it.
>>
>> I think the kernel of truth in these concerns is simply that
>> everything else ANALYZE looks at mutates only on DML. If you load the
>> same data into two databases and run ANALYZE you'll get (modulo random
>> sampling) the same stats. And if you never modify it and analyze it
>> again a week later you'll get the same stats again. So autovacuum can
>> guess when to run analyze based on the number of DML operations, it
>> can run it without regard to how busy the system is, and it can hold
>> off on running it if the data hasn't changed.
>>
>> In the case of the filesystem buffer cache the cached percentage will
>> vary over time regardless of whether the data changes. Plain select
>> queries will change it, even other activity outside the database will
>> change it. There are a bunch of strategies for mitigating this
>> problem: we might want to look at the cache situation more frequently,
>> discount the results we see since more aggressively, and possibly
>> maintain a kind of running average over time.
>>
>> There's another problem which I haven't seen mentioned. Because the
>> access method will affect the cache there's the possibility of
>> feedback loops. e.g. A freshly loaded system prefers sequential scans
>> for a given table because without the cache the seeks of random reads
>> are too expensive... causing it to never load that table into cache...
>> causing that table to never be cached and never switch to an index
>> method. It's possible there are mitigation strategies for this as well
>> such as keeping a running average over time and discounting the
>> estimates with some heuristic values.
>
> *scratches head*
>
> Well, yeah. I completely agree with you that these are the things we
> need to worry about. Maybe I did a bad job explaining myself, because
> ISTM you said my concerns were FUD and then went on to restate them in
> different words.
>
> I'm not bent out of shape about using ANALYZE to try to gather the
> information. That's probably a reasonable approach if it turns out we
> actually need to do it at all. I am not sure we do. What I've argued
> for in the past is that we start by estimating the percentage of the
> relation that will be cached based on its size relative to
> effective_cache_size, and allow the administrator to override the
> percentage on a per-relation basis if it turns out to be wrong. That
> would avoid all of these concerns and allow us to focus on the issue
> of how the caching percentages impact the choice of plan, and whether
> the plans that pop out are in fact better when you provide information
> on caching as input. If we have that facility in core, then people
> can write scripts or plug-in modules to do ALTER TABLE .. SET
> (caching_percentage = XYZ) every hour or so based on the sorts of
> statistics that Cedric is gathering here, and users will be able to
> experiment with a variety of algorithms and determine which ones work
> the best.
Robert, I am very surprised.
My patch does offer that.
1st, I used ANALYZE because it is the way to update pg_class I found.
You are suggesting ALTER TABLE instead, that is fine, but give me that
lock-free :) else we have the ahem.. Alvaro's pg_class_ng (I find this
one interesting because it will be lot easier to have different values
on standby server if we find a way to have pg_class_ng 'updatable' per
server)
So, as long as the value can be change without problem, I don't care
where it resides.
2nd, I provided the patches on the last CF, exactly to allow to go to
the exciting part: the cost-estimates changes. (after all, we can work
on the cost estimate, and if later we find a way to use ALTER TABLE or
pg_class_ng, just do it instead of via the ANALYZE magic)
3nd, you can right now write a plugin to set the value of rel_oscache
(exactly like the one you'll do for a ALTER TABLE SET reloscache...)
RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum)
{
float4 percent = 0;
/* if a plugin is present, let it manage things */
if (OSCache_hook)
percent = (*OSCache_hook) (relation, forkNum);
return percent;
}
Looks like the main fear is because I used the ANALYZE word...
PS: ANALYZE OSCACHE does *not* run with ANALYZE, those are distinct
operations. (ANALYZE won't do the job of ANALYZE OSCACHE, we can
discuss the grammar, maybe a ANALYZE ([OSCACHE], [DATA], ...) will be
better ).
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
From | Date | Subject | |
---|---|---|---|
Next Message | Dmitriy Igrishin | 2011-06-19 19:36:55 | Re: Libpq enhancement |
Previous Message | Florian Pflug | 2011-06-19 19:29:31 | Re: Range Types and extensions |