Re: Setting effective_io_concurrency in VM?

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Don Seiler <don(at)seiler(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Setting effective_io_concurrency in VM?
Date: 2017-12-08 04:51:08
Message-ID: 3b52abb7-37d5-d18c-d677-9f5052ffd4df@catalyst.net.nz
Lists: pgsql-performance

On 28/11/17 07:40, Scott Marlowe wrote:

> On Mon, Nov 27, 2017 at 11:23 AM, Don Seiler <don(at)seiler(dot)us> wrote:
>> Good afternoon.
>>
>> We run Postgres (currently 9.2, upgrading to 9.6 shortly) in VMWare ESX
>> machines. We currently have effective_io_concurrency set to the default of
>> 1. I'm told that the data volume is a RAID 6 with 14 data drives and 2
>> parity drives. I know that RAID10 is recommended, just working with what
>> I've inherited for now (storage is high-end HP 3Par and HP recommended RAID
>> 6 for best performance).
>>
>> Anyway, I'm wondering if, in a virtualized environment with a VM datastore,
>> it makes sense to set effective_io_concurrency closer to the number of data
>> drives?
>>
>> I'd also be interested in hearing how others have configured their
>> PostgreSQL instances for VMs (if there's anything special to think about).
> Generally VMs are never going to be as fast as running on bare metal
> etc. You can adjust it and test it with something simple like pgbench
> with various settings for -c (concurrency) and see where it peaks etc
> with the setting. This will at least get you into the ball park.
>
> A while back we needed fast machines with LOTS of storage (7TB data
> drives with 5TB of data on them) and the only way to stuff that many
> 800GB SSDs into a single machine was to use RAID-5 with a spare (I
> lobbied for RAID6 but was overridden, eh...) We were able to achieve
> over 15k TPS in pgbench with a 400GB data store on those boxes. The
> secret was to turn off the cache in the RAID controller and crank up
> effective_io_concurrency to something around 10 (not sure, it's been a
> while).
>
> tl;dr: Only way to know is to benchmark it. I'd guess that somewhere
> between 10 and 20 is going to get the best throughput but that's just
> a guess. Benchmark it and let us know!

Reasonably modern Linux hosts with Linux guests using Libvirt/KVM should
be able to get bare metal performance for moderate numbers of cpus (<=8
last time we benchmarked). It certainly *used* to be the case that
virtualization sucked for databases, but not so much now.

The advice to benchmark, however, is golden :-)
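
As a rough illustration of the kind of sweep Scott describes (pgbench at several values of -c, looking for where throughput peaks), here is a minimal Python sketch. It is not something anyone in this thread ran: the database name "bench", the 60 second run length, the client counts, and the helper names are all assumptions for illustration, and it presumes pgbench is on the PATH and the database was already initialised with pgbench -i.

```python
import re
import subprocess

# pgbench prints one or more lines starting with "tps = <number>";
# the exact trailing text differs between versions, so match only the prefix.
TPS_RE = re.compile(r"^tps = ([0-9.]+)", re.MULTILINE)


def pgbench_cmd(clients, seconds=60, dbname="bench"):
    """Build the pgbench command line for one timed run.

    "bench" is a hypothetical database name; -c sets the number of
    concurrent clients and -T the run duration in seconds.
    """
    return ["pgbench", "-c", str(clients), "-T", str(seconds), dbname]


def parse_tps(output):
    """Return the first tps figure found in pgbench's text output."""
    match = TPS_RE.search(output)
    if match is None:
        raise ValueError("no tps line found in pgbench output")
    return float(match.group(1))


def sweep(client_counts, seconds=60, dbname="bench"):
    """Run pgbench once per client count and collect {clients: tps}."""
    results = {}
    for clients in client_counts:
        proc = subprocess.run(
            pgbench_cmd(clients, seconds, dbname),
            check=True,
            capture_output=True,
            text=True,
        )
        results[clients] = parse_tps(proc.stdout)
    return results


def best(results):
    """Return the client count with the highest measured tps."""
    return max(results, key=results.get)


if __name__ == "__main__":
    measured = sweep([1, 2, 4, 8, 16, 32])
    for clients, tps in sorted(measured.items()):
        print(f"clients={clients:3d}  tps={tps:.1f}")
    print("peak at", best(measured), "clients")
```

To compare effective_io_concurrency settings you would repeat the whole sweep once per value and compare the curves. Bear in mind that the stock pgbench workload may not exercise that setting much, so a custom script closer to your real queries is probably a better test.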

Cheers

Mark
