Re: how to calibrate the cost model parameters

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, jian xu <jamesxu(at)outlook(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: how to calibrate the cost model parameters
Date: 2022-05-31 06:46:29
Message-ID: 1ec517a25fd893f783e3bd4606eccf036b0c6696.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Mon, 2022-05-30 at 17:12 -0400, Jeff Janes wrote:
> On Mon, May 30, 2022 at 3:40 PM jian xu <jamesxu(at)outlook(dot)com> wrote:
> > Thanks Laurenz.  It is ok that these values are "arbitrary scale", but we still need
> > to calibrate them to get the correct"arbitrary scale". For example if seq_page_cost
> > is 1 and cpu_tuple_cost is 0.01, how to verify the io seq cost is 100 times slower than cpu access cost?
> > "enable_seqscan" and "enable_nestloop" can work in some cases, but it doesn’t work with any cases.
> > Does anyone have experience to  calibrate the cost model parameters to get the correct value? Thanks
>
> I put a lot of time into it, and I think you will find that there is no consistent way to do so.
> For cpu_tuple_cost, for example, are the hint bits already set?  Do they need to get set?
> Are the clog pages still in memory?  Is there contention on the lock used to determine if a given
> transaction is still running?  How many columns does the table have? How many of them need to be
> accessed for the current case, and how far left and right are they in the table and are they toasted?
>
> For seq reads, how much contention is there for the buffer mapping partition lock?  What is the
> throughput of your IO system?  How many other seq reads will be occuring at the same time? Etc.
>
> For any change you intend to make, do you have a good enough load generator and test system set
> up so you can test that it doesn't make something else worse?

+1

I think that the values of those parameters are more set by experience and tradition than
by measurable physical evidence. In a way, it just happens to work (mostly).
For example, who says that on a spinning disk, random I/O is four times as slow as
sequential I/O? Very likely, you have to factor in that part of the I/O requests are
satisfied from the kernel page cache.

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2022-05-31 14:44:14 Re: how to calibrate the cost model parameters
Previous Message Jeff Janes 2022-05-30 21:12:06 Re: how to calibrate the cost model parameters