Re: Reliability with RAID 10 SSD and Streaming Replication

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Cuong Hoang <climbingrose(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Reliability with RAID 10 SSD and Streaming Replication
Date: 2013-05-16 22:39:49
Message-ID: CAMkU=1x6dE-seD8o0V0GvGg6q+cpU_C1AXfMVYHtyTg=LJcw=A@mail.gmail.com
Lists: pgsql-performance

On Thu, May 16, 2013 at 11:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Thu, May 16, 2013 at 1:34 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> > On Thu, May 16, 2013 at 7:46 AM, Cuong Hoang <climbingrose(at)gmail(dot)com> wrote:
> >>
> >> Hi all,
> >>
> >> Our application is heavy write and IO utilisation has been the problem for
> >> us for a while. We've decided to use RAID 10 of 4x500GB Samsung 840 Pro for
> >> the master server. I'm aware of write cache issue on SSDs in case of power
> >> loss. However, our hosting provider doesn't offer any other choices of SSD
> >> drives with supercapacitor. To minimise risk, we will also set up another
> >> RAID 10 SAS in streaming replication mode. For our application, a few
> >> seconds of data loss is acceptable.
> >>
> >> My question is, would corrupted data files on the primary server affect
> >> the streaming standby? In other word, is this setup acceptable in terms of
> >> minimising deficiency of SSDs?
> >
> > That seems rather scary to me for two reasons.
> >
> > If the data center has a sudden power failure, why would it not take out
> > both machines either simultaneously or in short succession? Can you verify
> > that the hosting provider does not have them on the same UPS (or even worse,
> > as two virtual machines on the same physical host)?
>
> I took it to mean that his standby's "raid 10 SAS" meant disk drive
> based standby.

I had not considered that. If the master can't keep up with IO using disk
drives, wouldn't a replica using them probably fall infinitely far behind
trying to keep up with the workload?

Maybe the best choice would be to stick with the current set-up (one
server, spinning rust) and just turn off synchronous_commit, since he is
already willing to take the loss of a few seconds of transactions.
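
For reference, a minimal sketch of what that change could look like in
postgresql.conf. The wal_writer_delay value shown is the stock default, not
something tuned for this workload. With synchronous_commit off, a crash can
lose the most recent commits (the documented worst case is about three times
wal_writer_delay), but the database itself stays consistent.

```
# postgresql.conf (sketch; values are assumptions, not tuned for this workload)
synchronous_commit = off    # commit returns before its WAL is flushed to disk
wal_writer_delay = 200ms    # default; the WAL writer flushes on this interval
```

The setting takes effect on a configuration reload. It can also be changed
per session or per transaction with SET synchronous_commit = off, so the
transactions that really must be durable can keep it on.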

Cheers,

Jeff
