Re: Changing the default random_page_cost value

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Changing the default random_page_cost value
Date: 2024-10-25 01:29:14
Message-ID: CAApHDvqNxnsirEv7NiTn_ZghFDjinHSaFJuTqx19M8a04WWmQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 25 Oct 2024 at 13:14, Greg Sabino Mullane <htamfids(at)gmail(dot)com> wrote:
>
> On Mon, Oct 14, 2024 at 10:20 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>>
>> Yeah, I think any effort to change the default value for this setting would require some analysis to prove that the newly proposed default
>> is a more suitable setting than the current default. I mean, why 1.2 and not 1.1 or 1.3? Where's the evidence that 1.2 is the best value
>> for this?
>
> As I said, I was just throwing that 1.2 number out there. It felt right, although perhaps a tad high (which seems right as we keep things very conservative). I agree we should make a best effort to have an accurate, defendable default. We all know (I hope) that 4.0 is wrong for SSDs.

I don't think we're going to find the correct new value for this
setting by throwing randomly chosen numbers at each other on an email
thread. Unfortunately, someone is going to have to do some work to
figure out what the number should be, and then hopefully someone else
can verify that work to check that person is correct.

I'm not trying to be smart or funny here, but I just am failing to
comprehend why you think you offering a number without any information
about how you selected that number to set as the new default
random_page_cost would be acceptable. Are you expecting someone else
to go and do the work to prove that your selected number is the
correct one? It's been 4 weeks since your first email and nobody has
done that yet, so maybe you might need to consider other ways to
achieve your goal.

>> I don't think just providing evidence that random read times are closer to sequential read times on SSDs are closer than they are with
>> HDDs is going to be enough.
>
> ...
>>
>> It would be nice to have this as a script so that other people could easily run it on their hardware to ensure that random_page_cost we
>> choose as the new default is representative of the average hardware.
>
>
> Heh, this is starting to feel like belling the cat (see https://fablesofaesop.com/belling-the-cat.html)

I don't see the similarity. Changing the default random_page_cost
requires analysis to find what the new default should be. The
execution of the actual change in default is dead simple. With
belling the cat, it seems like the execution is the hard part and
nobody is debating the idea itself.

> Remember this is still just a default, and we should encourage people to tweak it themselves based on their own workloads. I just want people to start in the right neighborhood. I'll see about working on some more research / generating a script, but help from others is more than welcome.

You might be mistakenly thinking that the best random_page_cost is an
exact ratio of how much slower a random seek and read is from a
sequential read. There are unfortunately many other factors to
consider. The correct setting is going to be the one where the chosen
plan uses the scan method that's the fastest and knowing the answer to
that is going to take some benchmarks on PostgreSQL. Our cost model
simply just isn't perfect enough for you to assume that I/O is the
only factor that changes between an Index Scan and a Seq Scan.

I'd say it's not overly difficult to come up with test cases that go
to prove the value you select is "correct". I've done this before for
CPU-related costs. I think with I/O the main difference will be that
your tests should be much larger, and doing that will mean getting the
results takes much more time. Here's a link to some analysis I did to
help solve a problem relating to partition-wise aggregates [1]. Maybe
you can use a similar method to determine random_page_cost.

David

[1] https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-10-25 01:44:31 Re: Docs Build in CI failing with "failed to load external entity"
Previous Message Richard Guo 2024-10-25 01:09:11 Re: Wrong results with grouping sets