Re: Will higher shared_buffers improve tpcb-like benchmarks?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Saurabh Nanda <saurabhnanda(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Will higher shared_buffers improve tpcb-like benchmarks?
Date: 2019-01-29 18:00:18
Message-ID: CAMkU=1xUq2H==+=TZttayVf=F3C+3iTfzp5PtAqosc9mS8j7bg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Jan 29, 2019 at 6:39 AM Saurabh Nanda <saurabhnanda(at)gmail(dot)com>
wrote:

> Hi,
>
> I'm going crazy trying to optimise my Postgres config for a production
> setting [1] Once I realised random changes weren't getting my anywhere, I
> finally purchased PostgreSQL 10 - Higher Performance [2] and understood the
> impact of shared_buffers.
>
> IIUC, shared_buffers won't have any significant impact in the following
> scenario, right?
>
> -- DB size = 30GB
> -- shared_buffers = 2GB
> -- workload = tpcb-like
>
> This is because the tpcb-like workload selects & updates random rows from
> the DB [3]. Therefore, with a 2GB shared buffer, there is only a 6-7%
> chance (did I get my probability correct?) that the required data will be
> in the shared_buffer. Did I understand this correctly?
>

That is likely correct, but the data will likely be stored in the OS file
cache, so reading it from there will still be pretty fast.

>
> If nothing else becomes the bottleneck (eg. periodically writing dirty
> pages to disk), increasing the shared_buffers to 15GB+ should have a
> significant impact, for this DB-size and workload, right? (The system has
> 64 GB RAM)
>

About the only way to know for sure that writing dirty data is not the
bottleneck is to use a read only benchmark, such as the -S flag for
pgbench. And at that point, the IPC overhead between pgbench and the
backend, even when both are running on the same machine, is likely to be
the bottleneck. And after that, the bottleneck might shift to opening and
closing transactions and taking and releasing locks[1].

If you overcome that, then you might reliably see a difference between 2GB
and 15GB of shared buffers, because at 2GB each query to pgbench_accounts
is likely to fetch 2 pages into shared_buffers from the OS cache: the index
leaf page for pgbench_accounts_pkey, and the table page for
pgbench_accounts. At 15GB, the entire index should be reliably in
shared_buffers (after enough warm-up time), so you would only need to fetch
1 page, and often not even that.

Cheers,

Jeff

[1] I have a very old patch to pgbench that introduces a new query to
overcome this,
https://www.postgresql.org/message-id/BANLkTi%3DQBYOM%2Bzj%3DReQeiEKDyVpKUtHm6Q%40mail.gmail.com
. I don't know how much work it would be to get it to compile against
newer versions--I stopped maintaining it because it became too much work to
rebase it past conflicting work, and because I lost interest in this line
of research.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Saurabh Nanda 2019-01-29 18:10:53 Re: Will higher shared_buffers improve tpcb-like benchmarks?
Previous Message Bob Jolliffe 2019-01-29 15:42:57 Re: Interpreting shared_buffers setting