Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID

From: "Holger Hoffstaette" <holger(dot)hoffstaette(at)googlemail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Deploying PostgreSQL on CentOS with SSD and Hardware RAID
Date: 2013-05-21 11:27:08
Message-ID: pan.2013.05.21.11.27.08.144987@googlemail.com
Lists: pgsql-general

On Tue, 21 May 2013 11:40:55 +1000, Toby Corkindale wrote:

>>> While it is important to let the SSD know about space that can be
>>> reclaimed, I gather the operation does not perform well. I *think*
>>> current advice is to leave 'discard' off the mount options, and instead
>>> run a nightly cron job to call 'fstrim' on the mount point instead. (In
>>> really high write situations, you'd be looking at calling that every
>>> hour instead I suppose)

This is still a good idea - see below.
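
For illustration, a minimal sketch of the periodic-trim approach: a small
Python wrapper around util-linux `fstrim` that cron could call. The function
names are made up for this example, and `fstrim` itself needs root and a
filesystem/device that supports TRIM.

```python
import subprocess


def fstrim_command(mount_point, verbose=True):
    """Build the argv for util-linux fstrim on a single mount point."""
    cmd = ["fstrim"]
    if verbose:
        cmd.append("-v")  # ask fstrim to report how many bytes it trimmed
    cmd.append(mount_point)
    return cmd


def trim(mount_point):
    """Run fstrim on mount_point (requires root); return its exit status."""
    result = subprocess.run(fstrim_command(mount_point))
    return result.returncode
```

Calling `trim()` on the data mount point from a nightly (or hourly) cron
entry would then replace the 'discard' mount option.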

>> The guy who blogged about this a couple of years ago was using a
>> Sandforce controller drive.

Btw that doesn't mean anything (in terms of either performance or
stability), since "the controller" also needs to be paired with firmware,
which is often vendor-specific and much more relevant. Since LSI
acquired Sandforce this situation has gotten much better (unified
upstream).

>> I'm not sure there is a similar issue with other drives. Certainly we've

There is (now), because..

>> never noticed a problematic delay in file deletes. That said, our
>> applications don't delete files too often (log file purging is probably
>> the only place it happens regularly).
>>
>> Personally, in the absence of a clear and present issue, I'd prefer to
>> go the "kernel guys and drive firmware guys will take care of this"
>> route, and just enable discard on the mount.

Nope, wrong, because.. (..getting there :)

> That is from 2011 though, so you're right that things may have improved by
> now.. Has anyone seen benchmarks supporting that though?

Unfortunately since kernel 3.8 discards are issued as synchronous commands,
effectively disabling any scheduling/merging etc. The result can be seen
easily:

- mount drive without discard using kernel >= 3.8
- unpack kernel source
- time delete of entire tree

- remount with discard
- unpack kernel tree
- start delete of tree
- ...
- check it hasn't crashed
- ...
- go plant a tree or make babies while waiting for it to finish
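
The comparison above can be roughly scripted. This is only a sketch under
assumptions: it builds a synthetic tree of small files instead of unpacking
a kernel source tree, and `make_tree` / `time_delete` are names invented
here. Run it once against a directory on the filesystem mounted without
discard and once after remounting with discard.

```python
import os
import shutil
import tempfile
import time


def make_tree(root, dirs=20, files_per_dir=50, size=4096):
    """Create dirs * files_per_dir small files under root; return the count."""
    payload = b"x" * size
    count = 0
    for d in range(dirs):
        sub = os.path.join(root, "dir%03d" % d)
        os.makedirs(sub)
        for f in range(files_per_dir):
            with open(os.path.join(sub, "file%03d" % f), "wb") as fh:
                fh.write(payload)
            count += 1
    return count


def time_delete(parent, **tree_args):
    """Build a scratch tree under parent, then time its removal in seconds."""
    root = tempfile.mkdtemp(prefix="discard-test-", dir=parent)
    count = make_tree(root, **tree_args)
    os.sync()  # flush the writes first so only the delete is timed
    start = time.perf_counter()
    shutil.rmtree(root)
    os.sync()  # include the time needed to commit the deletes
    return count, time.perf_counter() - start
```

A tree this small will not show much; scale `dirs` and `files_per_dir` up
towards kernel-tree size (tens of thousands of files) for a meaningful
number.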

Online discard has gotten so slow that it's now a good idea to turn it off
for anything but light write workloads. Metadata-heavy writes are
obviously the worst case.
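
To see which filesystems currently have online discard enabled, something
like the following sketch can help. It assumes the usual /proc/mounts layout
(device, mount point, type, comma-separated options); the function names are
invented for this example.

```python
def mounts_with_discard(mounts_text):
    """Return (device, mount_point) pairs whose options enable discard."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        for opt in options.split(","):
            # Match "discard" or "discard=..." but not e.g. "nodiscard".
            if opt == "discard" or opt.startswith("discard="):
                found.append((device, mount_point))
                break
    return found


def read_proc_mounts(path="/proc/mounts"):
    """Read the kernel's view of the mounted filesystems (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

On a Linux box, `mounts_with_discard(read_proc_mounts())` lists the mounts
to revisit.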

I experienced this on Samsung, Intel and Sandforce-based drives, so "the
controller" is no longer the primary reason for the performance impact.
Extremely enterprisey drives *might* behave slightly better, but I doubt
it; flash erase cycles are what they are.

-h
