Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?

From: Saurabh Nanda <saurabhnanda(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Benchmarking: How to identify bottleneck (limiting factor) and achieve "linear scalability"?
Date: 2019-01-26 11:10:55
Message-ID: CAPz=2oF8C-JAny9eM_gNkzQDj1CDjGfoQ17rcHm5-N-hOFAhUg@mail.gmail.com
Lists: pgsql-performance

Hi Jeff,

Thank you for replying.

>> wal_sync_method=fsync
>
> Why this change?

Actually, I re-checked and noticed that this config section was left at
its default values, shown below. Since the commented line said
`wal_sync_method = fsync`, I _assumed_ that's the default value. But it
seems that Linux uses fdatasync by default, and the output of
pg_test_fsync also shows that it is /probably/ the fastest method on my
hardware.

# wal_sync_method = fsync
# the default is the first option supported by the operating system:
# open_datasync
# fdatasync (default on Linux)
# fsync
# fsync_writethrough
# open_sync
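
A commented-out line in postgresql.conf documents a setting without setting it, so the value it shows is not necessarily the one in effect. Here is a minimal sketch of a check for that, assuming a simplified view of the file syntax: the helper name `explicit_setting` is hypothetical, and it ignores include directives and `#` characters inside quoted values.

```python
import re

def explicit_setting(conf_text, name):
    """Return the value explicitly assigned to `name`, or None when the
    parameter only appears in a comment (or not at all), in which case
    the server's built-in default applies."""
    # postgresql.conf allows "name = value" and "name value".
    pattern = re.compile(r"^\s*" + re.escape(name) + r"(?:\s*=\s*|\s+)(\S+)")
    value = None
    for line in conf_text.splitlines():
        code = line.split("#", 1)[0]      # naive comment stripping
        m = pattern.match(code)
        if m:
            value = m.group(1).strip("'")  # the last assignment wins
    return value

# Hypothetical config text, for illustration only.
sample = """
# wal_sync_method = fsync
synchronous_commit = off   # set explicitly
"""
print(explicit_setting(sample, "wal_sync_method"))     # prints None
print(explicit_setting(sample, "synchronous_commit"))  # prints off
```

On a running server, `SHOW wal_sync_method;` reports the value actually in effect, which is the more reliable check.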

> PGOPTIONS="-c synchronous_commit=off" pgbench -T 3600 -P 10 ....

I am currently running all my benchmarks with synchronous_commit=off and
will get back with my findings.

> You could also try pg_test_fsync to get low-level information, to
> supplement the high level you get from pgbench.

Thanks for pointing me to this tool. I never knew pg_test_fsync existed! I've
run `pg_test_fsync -s 60` twice and this is the output:
https://gist.github.com/saurabhnanda/b60e8cf69032b570c5b554eb50df64f8
I'm not sure what to make of it. Should I change `wal_sync_method` to
something other than the default value?
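
One rough way to read a pg_test_fsync report is to compare the ops/sec figures in its first section (one 8kB write per sync). The sketch below assumes result lines of the form `<method> <number> ops/sec ...`; the function name `fastest_sync_method` is hypothetical, and the sample numbers are made up for illustration, not taken from the gist above.

```python
import re

METHODS = {"open_datasync", "fdatasync", "fsync",
           "fsync_writethrough", "open_sync"}
LINE_RE = re.compile(r"^\s*(\w+)\s+([\d.]+)\s+ops/sec")

def fastest_sync_method(report):
    """Return (method, ops_per_sec) for the fastest method, using only the
    first measurement reported for each method; None if nothing parses."""
    rates = {}
    for line in report.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(1) in METHODS:
            # setdefault keeps the first section's figure for each method
            rates.setdefault(m.group(1), float(m.group(2)))
    if not rates:
        return None
    return max(rates.items(), key=lambda kv: kv[1])

# Made-up numbers, for illustration only.
SAMPLE = """\
Compare file sync methods using one 8kB write:
        open_datasync                      1000.000 ops/sec    1000 usecs/op
        fdatasync                          2000.000 ops/sec     500 usecs/op
        fsync                               500.000 ops/sec    2000 usecs/op
        fsync_writethrough                              n/a
        open_sync                           800.000 ops/sec    1250 usecs/op
"""
print(fastest_sync_method(SAMPLE))  # prints ('fdatasync', 2000.0)
```

Small differences between methods may just be run-to-run noise, so comparing the two runs before changing `wal_sync_method` seems prudent.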

> The effects of max_wal_size are going to depend on how you have IO
> configured, for example does pg_wal share the same devices and controllers
> as the base data? It is mostly about controlling disk usage and
> crash-recovery performance, neither of which is of primary importance to
> pgbench performance.

The WAL and the data directory reside on the same SSD. Is this a bad
idea? I was under the impression that smaller values of max_wal_size
cause the server to do "maintenance work" related to WAL rotation, etc.,
more frequently, which would lead to lower pgbench performance.
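
One way to test that impression is to compare checkpoint counters before and after a benchmark run. This sketch assumes the `checkpoints_timed` and `checkpoints_req` counters from the `pg_stat_bgwriter` view (where they live in the PostgreSQL releases current at the time of this thread); the function name and the sample numbers are hypothetical.

```python
def requested_checkpoint_share(before, after):
    """Fraction of checkpoints during the run that were requested (for
    example because WAL volume reached max_wal_size) rather than
    triggered by checkpoint_timeout. Returns 0.0 if none occurred."""
    timed = after["checkpoints_timed"] - before["checkpoints_timed"]
    requested = after["checkpoints_req"] - before["checkpoints_req"]
    total = timed + requested
    return requested / total if total else 0.0

# Hypothetical counter snapshots taken before and after a pgbench run.
before = {"checkpoints_timed": 10, "checkpoints_req": 2}
after = {"checkpoints_timed": 12, "checkpoints_req": 8}
print(requested_checkpoint_share(before, after))  # prints 0.75
```

A high share of requested checkpoints during a run would suggest that max_wal_size is being reached often, which is the situation described above.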

> Not all SSD are created equal, so the details here matter, both for the
> underlying drives and the raid controller.

Here's the relevant output from lshw --
https://gist.github.com/saurabhnanda/d7107d4ab1bb48e94e0a5e3ef96e7260 It
seems I have Micron SSDs. I tried finding more information on RAID but
couldn't get anything in the lshw or lspci output except the following --
`SATA controller: Intel Corporation Sunrise Point-H SATA controller [AHCI
mode] (rev 31)`. Moreover, the devices are showing up as /dev/md1,
/dev/md2, etc. So, if my understanding is correct, I don't think I'm on
hardware RAID, but software RAID, right?
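
Device names like /dev/md1 normally belong to the Linux md (software RAID) driver, which reports its arrays in /proc/mdstat. The following is a naive sketch for summarising that file, assuming the common `mdN : active <level> <members>` line layout; the function name `md_arrays` is hypothetical, the sample text is made up, and arrays in other states (for example inactive ones) are skipped.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+)\s*:\s*(\w+)\s+(\w+)\s+(.*)$")

def md_arrays(mdstat_text):
    """Map each md array name to its state, RAID level and member devices."""
    arrays = {}
    for line in mdstat_text.splitlines():
        m = ARRAY_RE.match(line)
        if m:
            name, state, level, members = m.groups()
            arrays[name] = {
                "state": state,
                "level": level,
                # "sda2[0]" -> "sda2": drop the role index and any flags
                "members": [re.sub(r"\[\d+\].*$", "", d)
                            for d in members.split()],
            }
    return arrays

# Made-up /proc/mdstat content, for illustration only.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      523712 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""
info = md_arrays(SAMPLE_MDSTAT)
print(info["md1"]["level"])    # prints raid1
print(info["md1"]["members"])  # prints ['sdb2', 'sda2']
```

On the server itself the input would come from `open("/proc/mdstat").read()`. If that file lists the arrays, they are managed by the kernel in software rather than by a RAID controller.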

These machines are from the EX-line of dedicated servers provided by
Hetzner, btw.

PS: Cc-ing the list back again because I assume you didn't intend for your
reply to be private, right?

-- Saurabh.
