Re: Postgresql 9.4 and ZFS?

From: Joseph Kregloh <jkregloh(at)sproutloud(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Benjamin Smith <lists(at)benjamindsmith(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgresql 9.4 and ZFS?
Date: 2015-10-02 01:04:34
Message-ID: CAAW2xfcVktisiJvXsJWb8gh6kYNso2_mr-fiwD8gHgQhZ9Rwpw@mail.gmail.com
Lists: pgsql-general

On Thu, Oct 1, 2015 at 5:51 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:

> On 10/1/15 8:50 AM, Joseph Kregloh wrote:
>
>> In my testing with pgbench I actually saw a decrease in performance with
>> a ZIL enabled. I ended up just keeping the L2ARC and dropping the ZIL. A
>> ZIL will not provide you with any speed boost for a database. On a NAS
>> with NFS shares, for example, a ZIL would work well. ZIL is more for data
>> protection than anything.
>>
>> I run FreeBSD 10.1 in production with an NVMe mirror for L2ARC; the rest
>> of the storage is spinning drives, with a combination of filesystem
>> compression settings. For example, archival tablespaces and the log
>> folder are on gzip compression on an external array. Faster stuff like
>> the xlog is on lz4 and on an internal array.
>>
>
> I'm not a ZFS expert, but my understanding is that a ZIL *that has lower
> latency than main storage* can be a performance win. This is similar to the
> idea of giving pg_xlog its own dedicated volume so that it's not competing
> with all the other IO traffic every time you do a COMMIT.
>
> Recent versions of Postgres go to a lot of trouble to make fsync as
> painless as possible, so a ZIL might not help much in many cases. Where it
> could still help is if you're running synchronous_commit = true and you
> consistently get lower latency on the ZIL than on the vdevs; that will
> make every COMMIT run faster.
>
> (BTW, this is all based on the assumption that ZFS treats fsync as a
> synchronous request.)

The ZIL, or ZFS Intent Log, is just a log, as the name describes. It only
replays transactions that may have been lost in the event of machine
failure. If the machine crashes, then on the next startup ZFS will replay
the data stored in the ZIL and try to fix any errors. During runtime the ZIL
is never read from, only written to.

When there is no separate ZIL device, a synchronous write makes ZFS store
the data in RAM and in the ZIL residing on the vdev. Once it acknowledges
that the data is all there, it will flush from RAM into its final write
location on the vdev.

When you have a fast ZIL device like an SSD or NVMe drive, it does the same
thing: it stores the data in RAM and on the fast ZIL device. Once the write
is acknowledged, it also writes from RAM into the vdev. In theory this does
give you a faster acknowledgement time.
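
One rough way to see whether a faster log device changes acknowledgement
time is to time small write+fsync calls on the dataset in question. This is
a minimal sketch, not a benchmark: the function name, the default 8 kB
write size, and the count are assumptions for illustration, and it relies
on the filesystem treating fsync as a synchronous request.

```python
import os
import tempfile
import time


def fsync_latencies(directory, count=20, size=8192):
    """Return per-call latencies (seconds) for `count` write+fsync calls.

    `directory` should live on the filesystem being tested. `size` defaults
    to 8192 bytes purely as an illustrative small synchronous write.
    """
    payload = b"\0" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # block until the filesystem reports the data durable
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)  # remove the scratch file
    return latencies
```

Running it once with and once without the separate log device attached, on
an otherwise idle pool, gives two sets of numbers to compare. Something like
pgbench remains the better test of real database behaviour.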

In either case you are still "bottlenecked" by the speed of the write from
RAM to the zpool. Now, for a small database with not many writes, a ZIL
would be awesome. But on a write-heavy database you will be acknowledging
more writes because of the ZIL than you are physically able to write from
RAM to the zpool, thereby degrading performance.
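
The argument above can be put into a toy model. This is only a sketch of
the reasoning, not of how ZFS actually schedules writes: the function name,
the rates, and the single "RAM limit" parameter are all made-up assumptions.

```python
def simulate_backlog(ack_rate, flush_rate, seconds, ram_limit):
    """Toy model of acknowledged-but-unflushed data held in RAM.

    ack_rate   -- MB/s of synchronous writes the log device can acknowledge
    flush_rate -- MB/s that can be written from RAM to the pool's vdevs
    ram_limit  -- MB of unflushed data allowed in RAM (hypothetical cap)
    Returns a list of (accepted_mb, backlog_mb), one entry per second.
    """
    backlog = 0.0
    history = []
    for _ in range(seconds):
        # Accept what the log device can acknowledge, but stall writers
        # once the backlog would grow past the RAM limit.
        accepted = min(ack_rate, ram_limit - backlog + flush_rate)
        backlog = max(0.0, backlog + accepted - flush_rate)
        history.append((accepted, backlog))
    return history
```

With ack_rate=200, flush_rate=100 and ram_limit=300, the first three
seconds accept 200 MB each while the backlog grows to 300 MB. After that
only 100 MB per second is accepted, which is the flush rate. With an
ack_rate below the flush rate the backlog stays at zero and the faster
acknowledgement is pure gain.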

At least this is how it works in my head.

-Joseph Kregloh

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>
