Re: RAID and SSD configuration question

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: RAID and SSD configuration question
Date: 2015-10-20 19:09:45
Message-ID: CAHyXU0wu0+YgQOzZ3GHgqtNuA8AmOULVgy4nWZ2PvtTtJCL2FA@mail.gmail.com
Lists: pgsql-general

On Tue, Oct 20, 2015 at 12:28 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
>> On Tue, Oct 20, 2015 at 9:33 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
>>> We're running LSI MegaRAIDs at work with 10 SSD RAID-5 arrays, and we
>>> can get ~5k to 7k tps on a -s 10000 pgbench with the write cache on.
>>>
>>> When we turn the write cache off, we get 15k to 20k tps. This is on a
>>> 120GB pgbench db that fits in memory, so it's all writes.
>>
>> These are my findings exactly. I'll double down on my statement:
>> caching RAID controllers are essentially obsolete technology. They
>> are designed to solve a problem that simply doesn't exist any more
>> because of SSDs. Unless your database is very, very busy, it's pretty
>> hard to saturate a single low- to mid-tier SSD with zero engineering
>> effort. It's time to let go: spinning drives are obsolete in the
>> database world, at least in any scenario where you're measuring IOPS.
>
> Here's what's REALLY messed up. The older the firmware on the
> MegaRAID, the faster it ran with caching on. We had 3- to 4-year-old
> firmware and were getting 7 to 8k tps. As we upgraded firmware it got
> all the way down to 3k tps, then the very latest got it back up to 4k
> or so. No matter what version of the firmware, turning off caching got
> us to 15 to 18k easily. So it appears that more aggressive and complex
> caching algorithms just made things worse and worse.

Another plausible explanation is that they fixed edge-case concurrency
issues in the firmware at the cost of performance, invalidating the
engineering trade-offs made against the cheapo CPU they stuck on the
controller next to the old, slow 1GB DRAM. Of course, we'll never
know, because the firmware is proprietary and closed-source. I'll
stick to mdadm, thanks.
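
For scale, here is a rough sketch of what the tps figures quoted above
work out to as a speedup from turning the write cache off. The numbers
are only the approximate ranges from this thread ("4k or so" is taken
as 4.0), and the labels and function names are made up for
illustration; this is arithmetic on quoted figures, not a benchmark.

```python
# Approximate tps figures quoted in this thread, in thousands of tps,
# stored as (low, high) ranges. The dictionary labels are illustrative
# names, not actual firmware versions.
CACHE_ON = {
    "old firmware": (7.0, 8.0),
    "intermediate firmware": (3.0, 3.0),
    "latest firmware": (4.0, 4.0),
}
CACHE_OFF = (15.0, 18.0)  # reported as roughly the same on every firmware


def speedup_range(on, off):
    """Return the (min, max) speedup from disabling the write cache."""
    on_low, on_high = on
    off_low, off_high = off
    # Smallest speedup: slowest cache-off run vs. fastest cache-on run.
    # Largest speedup: fastest cache-off run vs. slowest cache-on run.
    return off_low / on_high, off_high / on_low


def summarize():
    """Map each firmware label to its (min, max) speedup range."""
    return {name: speedup_range(on, CACHE_OFF) for name, on in CACHE_ON.items()}


if __name__ == "__main__":
    for name, (low, high) in summarize().items():
        print(f"{name}: {low:.2f}x to {high:.2f}x faster with cache off")
```

Even against the fastest (oldest) firmware, the quoted numbers put the
cache-off configuration at roughly twice the throughput or better.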

merlin
