Re: explanation for random_page_cost is outdated

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: yigong hu <yigongh(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, splarv(at)ya(dot)ru, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: explanation for random_page_cost is outdated
Date: 2020-04-27 04:02:41
Message-ID: CAFj8pRA35t++NeQ2mbT=3O9=o7M0hrsRZUmPOCRErZR7Am5A_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

ne 26. 4. 2020 v 21:25 odesílatel yigong hu <yigongh(at)gmail(dot)com> napsal:

> Sorry to hijack the thread, I also recently have similar observation that
> the statement about random_page_cost on SSD is ambiguous. The current
> document says that
>
> > Storage that has a low random read cost relative to sequential, e.g.
> solid-state drives, might also be better modeled with a lower value for
> random_page_cost.
>
> However, this statement does not clarify what values might be good. For
> some workload, the default value 4.0 would cause bad performance and
> lowering random_page_cost to a value 3.0 or 2.0 does not solve the
> performance problem. Only when the random_page_cost is lowered to below 1.2
> will the bad performance be mitigated. Thus, I would suggest elaborating on
> this description further as:
>
> > Storage that has a low random read cost relative to sequential, e.g.
> solid-state drives, might also be better modeled with a value that is close
> to 1 for random_page_cost.
>

I depends on estimation. Lot of people use random_page_cost as fix of
broken estimation. Then configures this value to some strange values. Lot
of other queries with good estimation can be worse then.

> Detail:
>
> I run the PostgreSQL 11 on an SSD hardware. The database has two small
> tables with 6MB and 16MB separately. The pgbench runs a select join query
> in 1 min. The result shows that when the random_page_cost is 1, the average
> latency is 14ms. When the random_page_cost is 1.5, 2, 3 or 4, the average
> latency is 26ms. This result suggests that setting random_page_cost to a
> value larger than 1.5 would cause almost 2x latency. If I increase the 6MB
> table to 60MB and rerun the sysbench, the result shows that when the
> random_page_cost is 1, the average latency is 13ms. When the
> random_page_cost is 1.5,2,3 or 4, the average latency is 17ms.
>
> I attached my testing script, the postgresql configuration file, and
> planner output.
>
> On Mon, Apr 27, 2020 at 3:19 AM Олег Самойлов <splarv(at)ya(dot)ru> wrote:
>
>> Yep. Unclear. What parameter is recommended for SSD? Lower? 3? 2? 1?
>>
>> Much better will be write: if you use SSD set 1.
>>
>> Олег
>>
>> > 19 марта 2020 г., в 23:56, Bruce Momjian <bruce(at)momjian(dot)us> написал(а):
>> >
>> > On Thu, Feb 27, 2020 at 02:48:44PM +0000, PG Doc comments form wrote:
>> >> The following documentation comment has been logged on the website:
>> >>
>> >> Page: https://www.postgresql.org/docs/12/runtime-config-query.html
>> >> Description:
>> >>
>> >> Explanation for random_page_cost is rather outdated, because it did
>> only for
>> >> case of mechanical hdd. But all modern database servers, which I know,
>> made
>> >> upon SSD. Do or not do default value for random_page_cost equal to 1
>> is the
>> >> question, but, IMHO, at list in the documentation about
>> random_page_cost
>> >> need to add in a speculation about SSD.
>> >>
>> >> It's important because a business programming now is mostly web
>> programming.
>> >> Most database is poorly designed by web programmer, tables looked like
>> a
>> >> primary key and a huge json (containing all) with large gin index upon
>> it.
>> >> Now I am seeing a table with a GIN index 50% of the table size. The
>> database
>> >> is on SSD, of cause. With default random_page_cost=4 GIN index don't
>> used
>> >> by planner, but with random_page_cost=1 the result may be not
>> excellent, but
>> >> acceptable for web programmers.
>> >
>> > Does this sentence in the random_page_cost docs unclear or not have
>> enough
>> > visibility:
>> >
>> >
>> https://www.postgresql.org/docs/12/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
>> >
>> > Storage that has a low random read cost relative to sequential, e.g.
>> > solid-state drives, might also be better modeled with a lower value
>> for
>> > random_page_cost.
>> >
>> > --
>> > Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
>> > EnterpriseDB https://enterprisedb.com
>> >
>> > + As you are, so once was I. As I am, so you will be. +
>> > + Ancient Roman grave inscription +
>>
>>
>>
>>
>>
>>

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Alexander Lakhin 2020-04-27 05:00:01 Re: Rendering pi more nicely in PDF
Previous Message yigong hu 2020-04-26 19:24:51 Re: explanation for random_page_cost is outdated