Re: how to calibrate the cost model parameters

From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: how to calibrate the cost model parameters
Date: 2022-06-02 06:17:16
Message-ID: bf2c942c-03bc-1c03-b983-1ee416570699@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Manually (aka through a cron script) ANALYZE those seven tables on a regular
basis.  During the business day, I ANALYZE some tables *every two hours*.

On 5/31/22 21:20, jian xu wrote:
>
> Thanks everyone. The reason I asked this question is, my system has a
> complex query, which uses CTE, then join 7 other tables. It also has where
> clause using json data filter. Sometimes it runs slowly(5 minutes), but
> sometimes it runs fast(2 seconds), (query text is same , no parameter). I
> know it is because data distribution change. The table is  updated
> frequently.
>
> I compared the execution plan, the fast query joins other tables first,
> then join CTE, however, the slow question joins CTE first, then join other
> tables. I analyzed the tables, it didn’t work, I increased the column
> sampling from 100 to 10000 for all the columns used in where clause,
> didn’t work. I set the random io cost to 1, it didn’t work.
>
> I am not able to modify the query. The only solution I find is
>
> 1. use |join_collapse_limit|to force the join order, however, it has
> limitation, I need to modify the code to set it before submitting the
> query, and set it back to original value after query run, in case it
> didn’t change back, other query will affect.
> 2. Set both sequence and random io cost to 0.1, it can always generate
> correct plan.
>
> So I suspect the cost model parameters are not fully optimized on my
> system, the fast and slow execution plan cost might be very close, so a
> little bit data change will cause the plan change, and generated a wrong plan.
>
> Thanks
>
> James
>
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for Windows
>
> *From: *Tom Lane <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>
> *Sent: *Tuesday, May 31, 2022 10:44 AM
> *To: *Laurenz Albe <mailto:laurenz(dot)albe(at)cybertec(dot)at>
> *Cc: *Jeff Janes <mailto:jeff(dot)janes(at)gmail(dot)com>; jian xu
> <mailto:jamesxu(at)outlook(dot)com>; pgsql-admin(at)postgresql(dot)org
> *Subject: *Re: how to calibrate the cost model parameters
>
> Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> writes:
> > I think that the values of those parameters are more set by experience
> and tradition than
> > by measurable physical evidence.  In a way, it just happens to work
> (mostly).
> > For example, who says that on a spinning disk, random I/O is four times
> as slow as
> > sequential I/O?  Very likely, you have to factor in that part of the I/O
> requests are
> > satisfied from the kernel page cache.
>
> The 4.0 value *is* based on some real evidence --- many years ago, I put
> in a lot of machine time to get an estimate of that for non-cached fetches
> on the hardware I was using at the time.  If somebody did that over again
> today, they'd likely get a somewhat different value, even assuming they
> were still using spinning-rust storage.  But it doesn't matter too much.
> A value around that is probably fine for rotating storage, while if you
> are using SSD it's likely better to use something around 1.0, and beyond
> that any inaccuracies in the cost constants are usually swamped by
> imperfections of the cost models and other data.  For instance, it's not
> that useful to sweat about what the right value is for cpu_operator_cost
> when we've never spent any serious effort on assigning plausible procost
> factors to different functions.  (Worse, the whole notion that a given
> function has a fixed cost independent of its input data is sadly
> inadequate.)
>
>                         regards, tom lane
>

--
Angular momentum makes the world go 'round.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message jian xu 2022-06-02 12:07:23 RE: how to calibrate the cost model parameters
Previous Message Wells Oliver 2022-06-01 05:18:05 Re: Casting json (or jsonb) to real