Re: What about utility to calculate planner cost constants?

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject: Re: What about utility to calculate planner cost constants?
Date: 2005-03-22 21:28:18
Message-ID: 87is3j9xql.fsf@stark.xeocode.com
Lists: pgsql-performance


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Christopher Browne <cbbrowne(at)acm(dot)org> writes:
> > Martha Stewart called it a Good Thing when gsstark(at)mit(dot)edu (Greg Stark) wrote:
> >> It's just a linear algebra problem with a bunch of independent
> >> variables and a system of equations. Solving for values for all of
> >> them is a straightforward problem.
>
> > Are you certain it's a linear system? I'm not.
>
> I'm quite certain it isn't a linear system, because the planner's cost
> models include nonlinear equations.

The equations will all be linear in the *_cost variables. If they weren't,
they would be meaningless: the units would come out wrong. Things like caching
just end up in the linear coefficients that determine how many random page
costs and sequential page costs to charge the query.
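
To make that concrete, the estimated cost of a plan is essentially a weighted
sum of operation counts; schematically (a sketch, not the planner's exact
formula, with c_* standing in for constants like random_page_cost and
cpu_tuple_cost):

    total_cost ~= c_seq_page    * N_seq_pages
                + c_random_page * N_random_pages
                + c_cpu_tuple   * N_tuples
                + c_cpu_op      * N_operator_evals

The counts N_* can depend nonlinearly on table sizes, selectivities, caching,
and so on, but once they are fixed the expression is linear in the cost
constants, so observations from several different queries form an ordinary
linear system.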

> While I don't have a whole lot of hard evidence to back this up, my
> belief is that our worst problems stem not from bad parameter values
> but from wrong models.

I think these are orthogonal issues.

The times spent in real-world operations like random page accesses, sequential
page accesses, CPU operations, index lookups, etc. are all measurable
quantities. They can be measured directly, or approximated by working backwards
from the resulting net query times. Measuring these things instead of asking
the user to provide them is just a nicer user experience.

Separately, plugging these values into increasingly accurate models will
produce better estimates of how many of these operations a query will need to
perform.
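
As an illustration of the fitting step, here's a minimal sketch of backing
per-operation costs out of measured query times with ordinary least squares.
This is not PostgreSQL code; the operation counts and timings are made up, and
it assumes you already know how many pages and tuples each query touched:

    import numpy as np

    # Hypothetical per-query operation counts:
    # columns are [sequential pages, random pages, tuples processed].
    counts = np.array([
        [10000.0,    50.0, 200000.0],
        [  500.0,  4000.0,  30000.0],
        [20000.0,   100.0, 500000.0],
        [ 1500.0,  8000.0,  60000.0],
    ])

    # Measured wall-clock times for the same queries, in milliseconds.
    times_ms = np.array([950.0, 430.0, 2100.0, 820.0])

    # Least-squares solution of counts @ costs ~= times_ms gives one
    # cost per operation type, in milliseconds per operation.
    costs, _, _, _ = np.linalg.lstsq(counts, times_ms, rcond=None)

    # Dividing by the sequential-page cost expresses the results in the
    # planner's usual relative units (sequential page fetch = 1.0).
    print(costs / costs[0])

With enough queries covering different mixes of operations the system is
overdetermined, and the residual from the fit tells you how well the linear
model actually explains the observed times.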

> Anyway, I see little point in trying to build an automatic parameter
> optimizer until we have cost models whose parameters are more stable
> than the current ones.

Well, what people currently do is tweak these parameters until they produce
results for their workload that match reality. It would be neat if Postgres
could do this automatically.

Arguably, the more accurate the cost model, the less motivation there is for
automatic adjustment, since you could simply plug in accurate values from the
hardware specs. But actually I think it'll always be a nice feature.

--
greg
