Re: Reliability with RAID 10 SSD and Streaming Replication

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Reliability with RAID 10 SSD and Streaming Replication
Date: 2013-05-24 12:11:53
Message-ID: 519F5909.7070000@2ndQuadrant.com
Lists: pgsql-performance

On 5/22/13 2:45 PM, Shaun Thomas wrote:
> That read rate and that throughput suggest 8k reads. The queue size is
> 270+, which is pretty high for a single device, even when it's an SSD.
> Some SSDs seem to break down on queue sizes over 4, and 15 sectors
> spread across a read queue of 270 is pretty harsh. The drive tested here
> basically fell over on servicing a huge diverse read queue, which
> suggests a firmware issue.

That's basically it. I wouldn't pin the blame specifically on firmware
without further evidence, though. The last time I chased down an SSD
performance issue like this, it ended up being a Linux scheduler bug.
One thing I plan to do for future SSD tests is to try to replicate this
issue better, starting by increasing the number of clients to at least
300.
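
For anyone following along, the arithmetic behind the quoted "8k reads"
estimate can be sketched roughly as below: divide read throughput by
read rate to get the average request size. The input numbers are
hypothetical placeholders, not the figures from the test being
discussed, and the 512-byte sector size is an assumption about how the
I/O statistics were reported.

```python
# Rough sketch: estimate average read request size from I/O statistics.
# Assumption: sectors are 512 bytes, as in typical Linux I/O accounting.
SECTOR_BYTES = 512


def avg_request_bytes(reads_per_sec, read_kb_per_sec):
    """Average size of one read request, in bytes."""
    return read_kb_per_sec * 1024 / reads_per_sec


def avg_request_sectors(reads_per_sec, read_kb_per_sec):
    """Average size of one read request, in 512-byte sectors."""
    return avg_request_bytes(reads_per_sec, read_kb_per_sec) / SECTOR_BYTES


# Hypothetical sample values, NOT taken from the thread.
reads_per_sec = 1000.0
read_kb_per_sec = 8000.0

print(avg_request_bytes(reads_per_sec, read_kb_per_sec))    # 8192.0
print(avg_request_sectors(reads_per_sec, read_kb_per_sec))  # 16.0
```

An 8 KiB request is 16 sectors, which matches PostgreSQL's default
block size, so an average near 15 sectors is consistent with the
workload being dominated by single-block reads.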

Related: if anyone read my "Seeking PostgreSQL" talk last year, some of
my Intel 320 results there were understating the drive's worst-case
performance due to a testing setup error. I have a blog entry talking
about what was wrong and how it slipped past me at
http://highperfpostgres.com/2013/05/seeking-revisited-intel-320-series-and-ncq/

With that loose end sorted, I'll be kicking off a brand new round of SSD
tests on a 24 core server here soon. All those will appear on my blog.
The 320 drive is returning as the bang for buck champ, along with a DC
S3700 and a Seagate 1TB Hybrid drive with NAND durable write cache.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
