Re: WAL & ZFS

From: Rui DeSousa <rui(at)crazybean(dot)net>
To: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
Cc: pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL & ZFS
Date: 2022-04-01 20:23:29
Message-ID: E43F5F43-456E-459D-B0D5-680F599097F9@crazybean.net
Lists: pgsql-admin

> On Apr 1, 2022, at 1:56 PM, Scott Ribe <scott_ribe(at)elevated-dev(dot)com> wrote:
>
>> On Apr 1, 2022, at 11:49 AM, Rui DeSousa <rui(at)crazybean(dot)net> wrote:
>>
>> If you’re using RAIDZ# then performance is going to be heavily impacted and I would highly recommend NOT using RAIDZ# for a database server.
>
> Actually, I found even the performance of RAIDZ1 to be acceptable after appropriate configuration--current versions, lz4, etc.
>

It might be for a low-IOPS system; however, I would still recommend against it. I haven’t used RAIDZ in years; it might be good for an archive system, but I don’t see the value of it in a production database server. You also have to account for drive failures and replacement time. A replacement in a RAIDZ configuration is much more expensive than replacing a disk in a mirrored set. Disks today are larger as well, and the risk of another failure during a rebuild is exponentially increased, hence the need for RAIDZ2 and RAIDZ3.
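
For illustration only, here is a back-of-envelope Python sketch of why rebuilds on wide sets are riskier. The drive size and the 1-per-1e15-bits URE rate are assumed placeholder figures, not measurements, and it ignores the fact that remaining parity can still repair a read error:

# Rough odds of hitting at least one unrecoverable read error (URE)
# while rebuilding, as a function of how many surviving drives must
# be read in full. Numbers below are illustrative assumptions.
drive_tb = 16                       # assumed drive size in TB
bits_per_drive = drive_tb * 1e12 * 8
ure_rate = 1e-15                    # assumed: 1 URE per 1e15 bits read

for drives_read in (1, 3, 7):       # mirror partner vs. wider RAIDZ sets
    p_clean = (1 - ure_rate) ** (bits_per_drive * drives_read)
    print(f"{drives_read} drives read during rebuild -> "
          f"P(at least one URE) ~= {1 - p_clean:.1%}")

The more data a rebuild has to read back, the more likely it is to trip over an error, which is the argument for the extra parity in RAIDZ2/RAIDZ3.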

Personally, and for logical reasons, I would build a RAIDZ in powers of 2, i.e. 2, 4, or 8 data drives plus parity, and then stripe across a set of RAIDZ2 vdevs. So the first option would require 4 drives (2D + 2P) and would have the same usable storage as a RAID10 configuration; however, the RAID10 would perform better under load. The 4+p option seems to be the sweet spot, as the rebuild times on larger sets are not worth it, nor is it worth spreading a 128k record out over 8 drives. Of course one could use a larger record size, but would you want to? For me, 128k/16k is only 8 database blocks; it reminds me of using Oracle’s readahead=8 option :).
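
A quick Python sketch of that arithmetic (sizes in arbitrary units, layouts as assumed in the paragraph above):

# 4 disks: RAIDZ2 (2 data + 2 parity) vs. a stripe of two mirrored pairs.
disks, size = 4, 1.0
raidz2_usable = (disks - 2) * size   # 2 data disks -> 2 units usable
raid10_usable = (disks / 2) * size   # two mirrors   -> 2 units usable
print(raidz2_usable, raid10_usable)  # same capacity, different performance

# 128k ZFS recordsize over 16k database blocks.
recordsize_kb, db_block_kb = 128, 16
print(recordsize_kb // db_block_kb)  # -> 8 blocks per record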

Note: RAIDZ does not always stripe across all drives in the set the way a traditional RAID set does, i.e. it might only use 2+p instead of 8+p as configured; it depends on the size of the ZFS record currently being written out and on free space.
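
To make that concrete, here is a minimal Python sketch assuming 4k sectors (ashift=12) and an 8+2p RAIDZ2; it ignores padding and allocation details, and the numbers are only meant to show how smaller records touch fewer data disks:

import math

ashift_bytes = 4096                 # assumed 4k sectors (ashift=12)
data_disks = 8                      # assumed 8 data + 2 parity RAIDZ2

for record_kb in (8, 16, 32, 128):
    sectors = math.ceil(record_kb * 1024 / ashift_bytes)
    disks_touched = min(sectors, data_disks)
    print(f"{record_kb}k record -> {sectors} data sectors, "
          f"spread over up to {disks_touched} data disks")

An 8k record only needs 2 data sectors, so it lands as 2+p rather than the full 8+p width.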
