Re: Capacitors, etc., in hard drives and SSD for DBMS machines...

From: "Wes Vaske (wvaske)" <wvaske(at)micron(dot)com>
To: Levente Birta <blevi(dot)linux(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Capacitors, etc., in hard drives and SSD for DBMS machines...
Date: 2016-07-08 14:50:26
Message-ID: 1467989418640.93960@micron.com
Lists: pgsql-performance

> Why all this concern about how long a disk (or SSD) drive can stay up
> after a power failure?

When we're discussing SSD power loss protection, it's not a question of how long the drive can stay up but whether data at rest or data in flight are going to be lost/corrupted in the event of a power loss.

There are a couple of big reasons for this.

1. NAND write latency is actually somewhat poor.

SSDs consist of NAND chips, DRAM for cache, and a controller. If the SSD disabled its disk cache, write latencies under moderate load would move from the sub-100-microsecond range to the 1-10 millisecond range. This is due to how the SSD writes to NAND: a single write operation takes a fairly large amount of time, but large blocks can be written as a single operation.

2. Garbage Collection

If you're not familiar with GC, I definitely recommend reading up on it, as it's one of the defining characteristics of SSDs (and now SMR HDDs). The basic principle is that SSDs don't support modifying a page (8KB) in place. Instead, the contents need to be erased and then written. Additionally, the units of the chip that can be read, written, or erased are not the same size for each operation. Erase blocks are much bigger than pages (e.g., 2MB vs 8KB). This means that to modify an 8KB page, the entire 2MB erase block needs to be read into the disk cache, erased, then written back with the new 8KB page along with the rest of the existing data in the 2MB erase block.
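
To put numbers on the read-erase-write cycle described above, here is a minimal Python sketch. It is only an illustration of the simplified model in this message, not how any particular drive's firmware works; the function names and the list-of-pages representation are made up for the example, and the 8KB and 2MB sizes are the example figures quoted above.

```python
PAGE_SIZE = 8 * 1024                 # 8KB page, example size from the text
ERASE_BLOCK_SIZE = 2 * 1024 * 1024   # 2MB erase block, example size from the text


def pages_per_erase_block(page_size=PAGE_SIZE, block_size=ERASE_BLOCK_SIZE):
    """Number of pages that share a single erase block."""
    return block_size // page_size


def modify_page(block, index, new_page):
    """Hypothetical model of modifying one page inside an erase block.

    Returns (new_block_contents, pages_written).
    """
    cache = list(block)              # 1. read the whole erase block into cache
    cache[index] = new_page          # 2. apply the one-page change in cache
    erased = [None] * len(block)     # 3. erase the block on NAND
    for i, page in enumerate(cache): # 4. write every page back, changed or not
        erased[i] = page
    return erased, len(cache)


if __name__ == "__main__":
    n = pages_per_erase_block()
    block = [f"old-{i}" for i in range(n)]
    new_block, written = modify_page(block, 3, "new-3")
    print(n, written)
```

With these sizes, one logical 8KB update turns into a rewrite of every page in the erase block, which is why the drive wants to do this work in cache rather than synchronously.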

This operation needs to be power loss protected (it's the operation that the Crucial drives protect against). If it's not, then the data that is read into cache could be lost or corrupted if power is lost during the operation. The data in the erase block is not necessarily related to the page being modified and could belong to anything else in the filesystem. *IMPORTANT: This is data at rest that may have been written years prior. It is not just new data that may be lost if a GC operation cannot complete.*
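
The failure mode can be sketched the same way. This is a toy simulation under the simplified model above, with power assumed to fail right after the erase step; the function name and the `protected` flag are hypothetical, and a real drive's firmware is far more involved.

```python
def modify_with_power_loss(block, index, new_page, protected):
    """Simulate power failing immediately after the erase step.

    `block` is a list of pages standing in for one erase block on NAND.
    Returns the block contents as seen once power is restored.
    """
    cache = list(block)           # copy the erase block into volatile DRAM
    cache[index] = new_page       # apply the one-page change in cache

    for i in range(len(block)):   # erase the block on NAND
        block[i] = None

    # --- power fails here ---
    if protected:
        # Assumption: stored energy keeps the drive alive long enough
        # to flush the cached copy back to NAND.
        block[:] = cache
    # Without protection the DRAM copy is simply gone.
    return block


if __name__ == "__main__":
    at_rest = ["row-from-2012", "row-from-2014", "row-being-updated"]
    print(modify_with_power_loss(list(at_rest), 2, "updated", protected=False))
    print(modify_with_power_loss(list(at_rest), 2, "updated", protected=True))
```

In the unprotected run, the pages that were never being modified are lost along with the new write, which is the data-at-rest risk described above.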

TL;DR: Many SSDs will not disable disk cache even if you give the command to do so. Full Power Loss Protection at the drive level should be a requirement for any Enterprise or Data Center application to ensure no data loss or corruption of data at rest.

This is why there is so much concern about the internals of specific SSDs and their behavior in a power loss event. It can have a large impact on the reliability of the entire system.

Wes Vaske | Senior Storage Solutions Engineer
Micron Technology

________________________________________
From: pgsql-performance-owner(at)postgresql(dot)org <pgsql-performance-owner(at)postgresql(dot)org> on behalf of Levente Birta <blevi(dot)linux(at)gmail(dot)com>
Sent: Friday, July 8, 2016 5:36 AM
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Capacitors, etc., in hard drives and SSD for DBMS machines...

On 08/07/2016 13:23, Jean-David Beyer wrote:
> Why all this concern about how long a disk (or SSD) drive can stay up
> after a power failure?
>
> It seems to me that anyone interested in maintaining an important
> database would have suitable backup power on their entire systems,
> including the disk drives, so they could coast over any power loss.
>
> I do not have any database that important, but my machine has an APC
> Smart-UPS that has 2 1/2 hours of backup time with relatively new
> batteries in it. It is so oversize because my previous computer used
> much more power than this one does. And if my power company has a brown
> out or black out of over 7 seconds, my natural gas fueled backup
> generator picks up the load very quickly.
>
> Am I overlooking something?
>

UPS-es can fail too ... :)

And so many things can happen ... once I unplugged the power cord
from the UPS which powered the database server (which was a production
server) ... I thought it was powering something else :)
but lucky me ... the controller was flash backed

--
Levi

