Re: Postgresql 9.4 and ZFS?

From: Patric Bechtel <patric(dot)bechtel(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql 9.4 and ZFS?
Date: 2015-09-30 13:45:43
Message-ID: 560BE787.5060005@gmail.com
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Tomas,

Tomas Vondra wrote on 30.09.2015 at 14:01:
> Hi,
>
> On 09/30/2015 12:21 AM, Patric Bechtel wrote:
>> Hi Benjamin,
>>
>> if you're using compression, forget about that. You need to synchronize the ashift value to
>> the internal rowsize of your SSD, that's it. Make sure your SSD doesn't lie to you regarding
>> writing blocks and their respective order. In that case you might even choose to set
>> sync=disabled. Also, set atime=off and relatime=on. For faster snapshot transfers, you might
>> like to set the checksum algo to SHA256.
>
> What is "SSD rowsize"? Do you mean the size of the internal pages?

Yep. In my experience, it helps write performance a lot, at least over an extended period of time
(less write amplification).
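
To make the ashift arithmetic concrete: ashift is the base-2 logarithm of the sector size ZFS
assumes, so a 4k page gives ashift=12 and an 8k page gives ashift=13. A minimal sketch follows;
the helper name ashift_for is made up for illustration, and the real page size of a given SSD
has to come from its datasheet, not from this code.

```python
def ashift_for(page_size_bytes: int) -> int:
    """Return the ashift (log2 of the sector size) for a power-of-two page size."""
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if page_size_bytes <= 0 or page_size_bytes & (page_size_bytes - 1):
        raise ValueError("page size must be a positive power of two")
    # For a power of two, bit_length() - 1 is the exponent.
    return page_size_bytes.bit_length() - 1


if __name__ == "__main__":
    for size in (4096, 8192):
        print(f"{size} bytes -> ashift={ashift_for(size)}")
```

The resulting value is what you would pass as the ashift property when creating the pool.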

> FWIW I've been doing extensive benchmarking of ZFS (on Linux), including tests of different
> ashift values, and I see pretty much no difference between ashift=12 and ashift=13 (4k vs 8k).
>
> To show some numbers, these are pgbench results with 16 clients:
>
> type  scale   ashift=12  ashift=13  rsize=8k  logbias
> -----------------------------------------------------
> ro    small       53097      53159     53696    53221
> ro    medium      42869      43112     47039    46952
> ro    large        3127       3108     27736    28027
> rw    small        6593       6301      6384     6753
> rw    medium       1902       1890      4639     5034
> rw    large         561        554      2168     2585
>
> small=150MB, medium=2GB, large=16GB (on a machine with 8GB of RAM)
>
> The tests are "adding" the features, i.e. the columns are actually:
>
> * ashift=12
> * ashift=13
> * ashift=13 + recordsize=8kB
> * ashift=13 + recordsize=8kB + logbias=throughput
>
> I've also done a few runs with compression, but that reduces the performance a bit
> (understandably).

I'm somewhat surprised by how much the recordsize (rsize) value matters. I will recheck that. In my
case, compression actually improved throughput quite a bit, but that may depend on CPU speed versus
I/O speed. Our CPUs are quite powerful, but the SSDs are just SATA Samsung/OCZ models, at least
18 months old. Also, I measured write performance over several hours, to push the internal GC
of the SSDs to its limits. We had some problems in the past with SSDs (e.g. Intel) and their
behaviour (throughput below 1MB/s), so that's why I put some emphasis on that.
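
To put a number on the recordsize effect, here is a small sketch that computes the ratio between
the ashift=13 column and the ashift=13 + recordsize=8kB column from the pgbench figures quoted
above. The numbers are Tomas's; the dictionary layout and the speedup helper are just my
illustration.

```python
# pgbench tps (16 clients) from the quoted table. Column order:
# ashift=12, ashift=13, ashift=13 + recordsize=8kB, ... + logbias=throughput
RESULTS = {
    ("ro", "small"): (53097, 53159, 53696, 53221),
    ("ro", "medium"): (42869, 43112, 47039, 46952),
    ("ro", "large"): (3127, 3108, 27736, 28027),
    ("rw", "small"): (6593, 6301, 6384, 6753),
    ("rw", "medium"): (1902, 1890, 4639, 5034),
    ("rw", "large"): (561, 554, 2168, 2585),
}


def speedup(row, base=1, new=2):
    """Ratio of column `new` to column `base` (default: recordsize=8kB vs plain ashift=13)."""
    return row[new] / row[base]


if __name__ == "__main__":
    for (mode, scale), row in RESULTS.items():
        print(f"{mode} {scale:<6} recordsize=8kB vs ashift=13: {speedup(row):.2f}x")
```

The small data set barely moves, while the large one (bigger than RAM) changes by a large factor,
which is what surprised me.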

>>
>> As always, put zfs.conf into /etc/modprobe.d with
>>
>> options spl spl_kmem_cache_slab_limit=16384
>> options zfs zfs_arc_max=8589934592
>>
>> You might want to adjust the zfs_arc_max value to your liking. Don't set it to more than 1/3
>> of your RAM, just saying.
>
> Why? My understanding is that ARC cache is ~ page cache, although implemented differently and
> not as tightly integrated with the kernel, but it should release the memory when needed and
> such. Perhaps not letting it use all the RAM is a good idea, but 1/3 seems a bit too
> aggressive?

First of all: the setting is somewhat 'disregarded' by ZFS, as it only limits the net size of the
buffer. The gross size (with padding and alignment) isn't counted there, so in fact the cache fills
up to 2/3 of the memory, which is plenty. Also, sometimes the ARC shrinking process isn't as fast
as necessary, so leaving some headroom isn't a bad strategy, IMHO.
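
For what it's worth, the 1/3 rule is simple arithmetic. A rough sketch follows; the function
names and the 24 GiB machine are hypothetical, and only the two option names and the 16384 value
come from my earlier mail.

```python
GIB = 1024 ** 3


def arc_max_bytes(ram_bytes: int) -> int:
    """One third of RAM, in bytes (integer division), as a ceiling for zfs_arc_max."""
    return ram_bytes // 3


def modprobe_lines(ram_bytes: int) -> list:
    """Build the two lines for /etc/modprobe.d/zfs.conf for a given RAM size."""
    return [
        "options spl spl_kmem_cache_slab_limit=16384",
        f"options zfs zfs_arc_max={arc_max_bytes(ram_bytes)}",
    ]


if __name__ == "__main__":
    # Hypothetical machine with 24 GiB of RAM, so the ARC ceiling comes out at 8 GiB.
    print("\n".join(modprobe_lines(24 * GIB)))
```

With 24 GiB this yields exactly the 8589934592 (8 GiB) I used in the example above.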

Patric
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: GnuPT 2.5.2

iEYEARECAAYFAlYL54cACgkQfGgGu8y7ypBXKACg6fuuvzdUtDvHRbdyisJXZwxF
ORMAoK3mEQhsB+AybHTQzhZ6hR6xT+30
=9yFi
-----END PGP SIGNATURE-----
