From: Rui DeSousa <rui(at)crazybean(dot)net>
To: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
Cc: pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: WAL & ZFS
Date: 2022-04-01 20:23:29
Message-ID: E43F5F43-456E-459D-B0D5-680F599097F9@crazybean.net
Lists: pgsql-admin
> On Apr 1, 2022, at 1:56 PM, Scott Ribe <scott_ribe(at)elevated-dev(dot)com> wrote:
>
>> On Apr 1, 2022, at 11:49 AM, Rui DeSousa <rui(at)crazybean(dot)net> wrote:
>>
>> If you’re using RAIDZ# then performance is going to be heavily impacted and I would highly recommend NOT using RAIDZ# for a database server.
>
> Actually, I found even the performance of RAIDZ1 to be acceptable after appropriate configuration--current versions, lz4, etc.
>
It might be for a low-IOPS system; however, I would still recommend against it. I haven’t used RAIDZ in years; it might be good for an archive system, but I don’t see the value of it in a production database server. You also have to account for drive failures and replacement time. A rebuild in a RAIDZ configuration is much more expensive than replacing a disk in a mirrored set. Disks today are larger as well, and the risk of another failure during a rebuild has increased dramatically, hence the need for RAIDZ2 and RAIDZ3.
Personally, and for logical reasons, I would build a RAIDZ in powers of 2, i.e. 2, 4, or 8 data drives plus parity, and then stripe across multiple RAIDZ2 sets. The first option would require 4 drives (2D + 2P) and would have the same usable storage as a RAID10 configuration; however, the RAID10 would perform better under load. The 4+p option seems to be the sweet spot, as the rebuild times on larger sets are not worth it, nor is it worth spreading a 128k record over 8 drives - of course one could use a larger record size, but would you want to? For me, 128k/16k is only 8 database blocks; it reminds me of using Oracle’s readahead=8 option :).
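To make the layout math above concrete, here is a small sketch. It is purely illustrative arithmetic (the function name and the specific drive counts are my own, not anything from ZFS itself): how a 128k record divides across the data drives of each geometry, and what fraction of raw capacity is usable.

```python
# Illustrative sketch of the RAIDZ geometries discussed above.
# All names and numbers here are hypothetical, for the arithmetic only.

def raidz_layout(data_drives, parity_drives, recordsize_kb=128):
    """Return (total drives, KB of one record per data drive, usable fraction)."""
    total = data_drives + parity_drives
    per_drive_kb = recordsize_kb / data_drives   # one record's share per data drive
    usable_fraction = data_drives / total
    return total, per_drive_kb, usable_fraction

# 2D + 2P (RAIDZ2): 4 drives, same usable ratio as a 4-drive RAID10 set
print(raidz_layout(2, 2))    # (4, 64.0, 0.5)

# 4D + 2P: the "sweet spot" geometry mentioned above
print(raidz_layout(4, 2))    # 6 drives, 32.0 KB per data drive, ~2/3 usable

# 8 data drives: a 128 KB record spread to only 16 KB per drive
print(raidz_layout(8, 1)[1])   # 16.0
```

The last line is the "128k/16k" point: with 8 data drives, each drive sees only 16 KB of a full record, i.e. just a couple of 8k database blocks' worth per spindle.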
Note: RAIDZ does not always stripe across all drives in the set like a traditional RAID set would; i.e. it might only use 2+p instead of 8+p as configured. It depends on the size of the current ZFS record being written out and on free space.
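The variable stripe width described in the note can be sketched as follows. This is a simplified model under assumed parameters (4 KB sectors, an 8-data-drive set), not the actual ZFS allocator: a record only occupies as many data drives as it has sectors, plus parity.

```python
import math

# Simplified model of RAIDZ variable stripe width (assumptions: 4 KB
# sectors, 8 data drives, single parity). Not the real ZFS allocator.
def raidz_stripe_width(record_bytes, sector_bytes=4096, data_drives=8, parity=1):
    data_sectors = math.ceil(record_bytes / sector_bytes)
    used_data = min(data_sectors, data_drives)  # small records use fewer drives
    return used_data + parity                   # drives written for this record

print(raidz_stripe_width(8 * 1024))     # 8 KB record -> 2 data + 1 parity = 3
print(raidz_stripe_width(128 * 1024))   # full 128 KB record -> 8 data + 1 parity = 9
```

An 8 KB write lands on a 2+p stripe even though the set is configured 8+p, which is exactly the behavior the note warns about.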
Next message: Scott Ribe | 2022-04-01 20:43:26 | Re: WAL & ZFS
Previous message: Scott Ribe | 2022-04-01 17:56:58 | Re: WAL & ZFS